REQUEST FOR FILING APPLICATION UNDER RULE 1.53(b)




Pursuant to 37 CFR 1.53(b), please file a □ continuation / ☒ divisional    Atty Dkt.: 117-323

of the pending prior PATENT APPLICATION of:    C#    M#

Inventor: HERMON-TAYLOR et al.    Date: November 6, 2000

Serial No. 09/091,538    Group: 1645

Filed: September 16, 1998    Examiner: R. Baskar

For: NOVEL POLYNUCLEOTIDES AND POLYPEPTIDES IN PATHOGENIC MYCOBACTERIA 
AND THEIR USE AS DIAGNOSTICS, VACCINES AND TARGETS FOR CHEMOTHERAPY 
Assistant Commissioner for Patents 
Washington, DC 20231 
Sir: 

This request for filing under Rule 53(b) is made by the following named inventor(s) (using the above-identified title)
Inventor(s): HERMON-TAYLOR et al.

☒ Attached is a true copy of the prior application as originally filed including the specification, claims, Oath/Declaration
and drawings (if any) and abstract (if any). No amendments (if any) referenced in the Oath or Declaration filed to
complete the prior application introduced new matter.

☒ Priority is hereby claimed under 35 USC 119 based on the following foreign applications, the entire content of which is
hereby incorporated by reference in this application:

Application Number    Country          Day/Month/Year Filed

9526178.0             Great Britain    21 December 1995

PCT/GB96/03221        PCT              23 December 1996

☒ certified copy(ies) of foreign application(s) attached or

□ already filed on ____ in prior appln. no. ____ filed ____

☒ already filed in 09/091,538 filed September 16, 1998

□ Please amend the specification by inserting before the first line: --This application claims the benefit of U.S.
Provisional Application No. , filed , the entire content of which is hereby incorporated by reference in this
application.--

☒ The prior application is assigned to St. George's Hospital Medical School.

☒ Power of Attorney has been granted to B.J. Sadoff et al., Reg. No. 36,663, of Nixon & Vanderhye P.C., 1100 N. Glebe
Rd., 8th Floor, Arlington, VA 22201.

☒ Address all future communications to: Nixon & Vanderhye P.C., 1100 N. Glebe Rd., 8th Floor, Arlington, VA 22201.

☒ Please amend the specification by inserting before the first line --This is a divisional of application Serial No.
09/091,538, filed September 16, 1998, now pending, which is a 371 application of PCT/GB96/03221, filed December
23, 1996, the entire content of which is hereby incorporated by reference in this application.--

□ "Small entity" statement of record. □ "Small entity" statement attached.

☒ Petition filed in prior application to extend its life to insure co-pendency.

☒ The Examiner's attention is directed to the prior art cited in the parent application by applicant and/or Examiner for the
reasons stated therein.

☒ Please enter the attached and/or below preliminary amendment prior to calculation of filing fee:

The entire disclosure of the prior application above-referenced is considered as being part of the disclosure of this 
new application and is hereby incorporated by reference therein. 

FILING FEE IS BASED ON CLAIMS AS FILED LESS ANY HEREWITH CANCELED

Basic Filing Fee                                                       $   710.00

Total effective claims   23 - 20 (at least 20) = 3 x $ 18.00           $    54.00

Independent claims        4 -  3 (at least 3)  = 1 x $ 80.00           $    80.00

If any proper multiple dependent claims now added for first time,
add $270.00 (ignore improper)                                          $   270.00

SUBTOTAL                                                               $ 1,114.00

If "small entity," then enter half (1/2) of subtotal and subtract     -$ (  0.00)

SECOND SUBTOTAL                                                        $ 1,114.00

Assignment Recording Fee ($40.00)                                      $     0.00

TOTAL FEE ENCLOSED                                                     $ 1,114.00
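
The arithmetic of the fee sheet above can be checked with a short sketch. The rates are the ones printed on this sheet only; the function name and parameters are illustrative assumptions and are not part of the filing.

```python
from decimal import Decimal

# Rates as printed on the fee sheet above (assumed valid for this sheet only).
BASIC_FEE = Decimal("710.00")
PER_EXTRA_CLAIM = Decimal("18.00")        # each claim beyond 20
PER_EXTRA_INDEPENDENT = Decimal("80.00")  # each independent claim beyond 3
MULTIPLE_DEPENDENT_FEE = Decimal("270.00")


def filing_fee(total_claims, independent_claims, multiple_dependent,
               small_entity=False, assignment_fee=Decimal("0.00")):
    """Recompute the total fee from the claim counts (hypothetical helper)."""
    extra_total = max(total_claims - 20, 0) * PER_EXTRA_CLAIM
    extra_independent = max(independent_claims - 3, 0) * PER_EXTRA_INDEPENDENT
    multi = MULTIPLE_DEPENDENT_FEE if multiple_dependent else Decimal("0.00")
    subtotal = BASIC_FEE + extra_total + extra_independent + multi
    # Small entities subtract half of the subtotal; none is claimed on this sheet.
    reduction = subtotal / 2 if small_entity else Decimal("0.00")
    return subtotal - reduction + assignment_fee


# Figures entered on the sheet: 23 claims, 4 independent, multiple-dependent fee charged.
total = filing_fee(23, 4, multiple_dependent=True)
print(total)
```

With the counts entered on the sheet, the sketch gives 710 + 54 + 80 + 270, which agrees with the total fee enclosed.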
Any future submission requiring an extension of time is hereby stated to include a petition for such time extension. 
The Commissioner is hereby authorized to charge any deficiency in the fee(s) filed, or asserted to be filed, or which 
should have been filed herewith (or with any paper hereafter filed in this application by this firm) to our Account No. 14- 
1140. A duplicate copy of this sheet is attached. 
In re Patent Application of

HERMON-TAYLOR et al.                       Atty. Ref.: 117-323

Divisional of Serial No. 09/091,538        Group: Unassigned

Filed: Herewith                            Examiner: Unassigned

For: NOVEL POLYNUCLEOTIDES AND
POLYPEPTIDES IN PATHOGENIC
MYCOBACTERIA AND THEIR USE AS
DIAGNOSTICS, VACCINES AND TARGETS
FOR CHEMOTHERAPY

November 6, 2000

Assistant Commissioner for Patents
Washington, DC 20231

PRELIMINARY AMENDMENT

Sir:

Entry and consideration of the following amendments and remarks are requested.

IN THE SPECIFICATION:

Amend the specification as follows.

Insert the attached Sequence Listing after the claims pages.

IN THE CLAIMS:

Amend the claims as follows.

Cancel claims 2, 3, 16 and 17, without prejudice.

4. (Amended) A polynucleotide in substantially isolated form which encodes a
polypeptide according to claim 1 [any one of claims 1 to 3].

8. (Amended) A polynucleotide probe which comprises a fragment of at least 15
nucleotides of a polynucleotide as defined in claim 5 [any one of claims 4 to 7], optionally
carrying a revealing label.

9. (Amended) A recombinant vector carrying a polynucleotide as defined in claim 5 
[any one of claims 4 to 7]. 

10. (Amended) An antibody capable of binding a polypeptide or fragment thereof as 
defined in claim 1 [any one of claims 1 to 3]. 

12. (Amended) A test kit for detecting the presence or absence of a pathogenic
mycobacterium in a sample which comprises a polynucleotide according to claim 4 [any one of
claims 4 to 8], a polypeptide according to claim 1 [any one of claims 1 to 3], a polypeptide which
comprises a sequence selected from the sequences of Seq.ID.No: 31, 33, 35, 37 and 39 or a
polypeptide substantially homologous thereto, or an antibody according to claim 10 [, any one of
claims 10 or 11].

13. (Amended) A method of detecting the presence or absence of antibodies in an
animal or human, against a pathogenic mycobacteria in a sample which comprises:

(a) providing a polypeptide according to [any one of claims 1 to 3] claim 1 or a
polypeptide which comprises a sequence selected from the sequences of Seq.ID.No: 31, 33, 35,
37 and 39 or a polypeptide substantially homologous thereto, which comprises an epitope;

(b) incubating a biological sample with said polypeptide under conditions which allow
for the formation of an antibody-antigen complex; and

(c) determining whether antibody-antigen complex comprising said polypeptide is
formed.

14. (Amended) A method of detecting the presence or absence of a polypeptide
according to [any one of claims 1 to 3] claim 1 or a polypeptide which comprises a sequence
selected from the sequences of Seq.ID.No: 31, 33, 35, 37 and 39 or a polypeptide substantially
homologous thereto in a biological sample, which method comprises:

(a) providing an antibody according to claim 10 [any one of claims 10 and 11];

(b) incubating a biological sample with said antibody under conditions which allow for
the formation of an antibody-antigen complex; and

(c) determining whether antibody-antigen complex comprising said antibody is formed.

15. (Amended) A method of detecting the presence or absence of cell mediated immune
reactivity in an animal or human, to a polypeptide according to claim 1 [claims 1 to 3] or a
polypeptide which comprises a sequence selected from the sequences of Seq.ID.No: 31, 33, 35,
37 and 39 or a polypeptide substantially homologous thereto, which method comprises

(a) providing a polypeptide according to claim 1 [any one of claims 1 to 3] or a
polypeptide which comprises a sequence selected from the sequences of Seq.ID.No: 31, 33, 35,
37 and 39 or a polypeptide substantially homologous thereto, which comprises an epitope;

(b) incubating a cell sample with said polypeptide under conditions which allow for a 
cellular immune response such as release of cytokines or other mediator or reaction to occur; and 

(c) detecting the presence of said cytokine or mediator or cellular response in the 
incubate. 

18. (Amended) A method of treating or preventing mycobacterial disease in an animal
or human caused by mycobacteria which express a polypeptide according to [claims 1 to 3]
claim 1 or a polypeptide which comprises a sequence selected from the sequences of Seq.ID.No:
31, 33, 35, 37 and 39 or a polypeptide substantially homologous thereto, which method
comprises vaccinating or treating an animal or human with an effective amount of said
polypeptide.

19. (Amended) A method of treating or preventing mycobacterial diseases in animals or
humans caused by mycobacteria containing the polynucleotide of Seq.ID.No: 3 or 4, which
method comprises vaccinating or treating an animal or human with an effective amount of a
polynucleotide according to claim 4 [claims 4 to 7], a vector according to claim 9 or a
polynucleotide which encodes a polypeptide which comprises a sequence selected from the
sequences of Seq.ID.No: 31, 33, 35, 37 and 39 or a polypeptide substantially homologous
thereto.

20. (Amended) A method according to claim 18 [claims 18 or 19] for increasing the in 
vivo susceptibility of mycobacteria to antimicrobial drugs. 

21. (Amended) A normally pathogenic mycobacterium, whose pathogenicity is 
mediated in all or in part by the presence or the expression of a polypeptide as defined in [any 
one of claims 1 to 3] claim 1 or a polypeptide which comprises a sequence selected from the 
sequences of Seq.ID.No: 31, 33, 35, 37 and 39 or a polypeptide substantially homologous
thereto, which mycobacterium harbours an attenuating mutation in a gene encoding one of the 
said polypeptides. 

REMARKS 

The claims have been amended to reduce the filing fees and delete improper multiple 
dependencies. 

The specification has been amended to include a Sequence Listing, a copy of which was 
filed in the parent Application No. 09/091,538. The attached paper copy of the Sequence Listing
is the same as the paper and computer readable copies of the Sequence Listing submitted in 
Application No. 09/091,538. The Office is requested to use the computer readable form of the 
Sequence Listing in the parent Application No. 09/091,538, in the present application. A 
separate Request to this effect is attached. No new matter has been added. 
An early and favorable Action on the merits is requested.

Respectfully submitted,

NIXON & VANDERHYE P.C.

B.J. Sadoff
Reg. No. 36,663

BJS:rdw

1100 North Glebe Road, 8th Floor
Arlington, VA 22201-4714
Telephone: (703) 816-4091
Facsimile: (703) 816-4100



Novel polynucleotides and polypeptides in pathogenic mycobacteria
and their use as diagnostics, vaccines and targets for
chemotherapy.

This invention relates to the novel polynucleotide sequence we
have designated "GS" which we have identified in pathogenic
mycobacteria. GS is a pathogenicity island within 8kb of DNA
comprising a core region of 5.75kb and an adjacent transmissible
element within 2.25kb. GS is contained within Mycobacterium
paratuberculosis, Mycobacterium avium subsp. silvaticum and some
pathogenic isolates of M. avium. Functional portions of the core
region of GS are also represented by regions with a high degree
of homology that we have identified in cosmids containing genomic
DNA from Mycobacterium tuberculosis.

Background to the invention

Mycobacterium tuberculosis (Mtb) is a major cause of global
diseases of humans as well as animals. Although conventional
methods of diagnosis including microscopy, culture and skin
testing exist for the recognition of these diseases, improved
methods, particularly new immunodiagnostics and DNA-based
detection systems, are needed. Drugs used to treat tuberculosis
are increasingly encountering the problem of resistant organisms.
New drugs targeted at specific pathogenicity determinants as well
as new vaccines for the prevention and treatment of tuberculosis
are required. The importance of Mtb as a global pathogen is
reflected in the commitment being made to sequencing the entire
genome of this organism. This has generated a large amount of
DNA sequence data of genomic DNA within cosmid and other
libraries. Although the DNA sequence is known in the art, the
functions of the vast majority of these sequences, the proteins
they encode, the biological significance of these proteins, and
the overall relevance and use of these genes and their products
as diagnostics, vaccines and targets for chemotherapy for
tuberculous disease, remain entirely unknown.

WO 97/23624    PCT/GB96/03221

Mycobacterium avium subsp. silvaticum (Mavs) is a pathogenic
mycobacterium causing diseases of animals and birds, but it can
also affect humans. Mycobacterium paratuberculosis (Mptb) causes
chronic inflammation of the intestine in many species of animals
including primates and can also cause Crohn's disease in humans.
Mptb is associated with other chronic inflammatory diseases of
humans such as sarcoidosis. Subclinical Mptb infection is
widespread in domestic livestock and is present in milk from
infected animals. The organism is more resistant to
pasteurisation than Mtb and can be conveyed to humans in retail
milk supplies. Mptb is also present in water supplies,
particularly those contaminated with run-off from heavily grazed
pastures. Mptb and Mavs contain the insertion elements IS900 and
IS902 respectively, and these are linked to pathogenicity in
these organisms. IS900 and IS902 provide convenient highly
specific multi-copy DNA targets for the sensitive detection of
these organisms using DNA-based methods and for the diagnosis of
infections in animals and humans. Much improvement is however
required in the immunodiagnosis of Mptb and Mavs infections in
animals and humans. Mptb and Mavs are, in general, resistant in
vivo to standard anti-tuberculous drugs. Although substantial
clinical improvements in infections caused by Mptb, such as
Crohn's disease, may result from treatment of patients with
combinations of existing drugs such as Rifabutin, Clarithromycin
or Azithromycin, additional effective drug treatments are
required. Furthermore, there is an urgent need for effective
vaccines for the prevention and treatment of Mptb and Mavs
infections in animals and humans based upon the recognition of
specific pathogenicity determinants.

Pathogenicity islands are, in general, 7-9kb regions of DNA
comprising a core domain with multiple ORFs and an adjacent
transmissible element. The transmissible element also encodes
proteins which may be linked to pathogenicity, such as by
providing receptors for cellular recognition. Pathogenicity
islands are envisaged as mobile packages of DNA which, when they
enter an organism, assist in bringing about its conversion from
a non-disease-causing to a disease-causing strain.



Description of the Drawings 
Figure 1(a) and (b) shows a linear map of the pathogenicity
island GS in Mavs (Fig 1a) and in Mptb (Fig 1b). The main open
reading frames are illustrated as ORFs A to H. ORFs A to F are
found within the core region of GS. ORFs G and H are encoded by
the adjacent transmissible element portion of GS.

Disclosure of the invention 

Using a DNA-based differential analysis technology we have
discovered and characterised a novel polynucleotide in Mptb
(isolates 0022 from a Guernsey cow and 0021 from a red deer).
This polynucleotide comprises the gene region we have designated
GS. GS is found in Mptb using the identifier DNA sequences
Seq.ID No 1 and 2, where Seq.ID No 2 is the complementary
sequence of Seq.ID No 1. GS is also identified in Mavs. The
complete DNA sequence incorporating the positive strand of GS
from an isolate of Mavs comprising 7995 nucleotides, including
the core region of GS and adjacent transmissible element, is
given in Seq.ID No 3. DNA sequence comprising 4435 bp of the
positive strand of GS obtained from an isolate of Mptb including
the core region of GS (nucleotides 1614 to 6047 of GS in Mavs)
is given in Seq.ID No 4. The DNA sequence of GS from Mptb is
highly (99.4%) homologous to GS in Mavs. The remaining portion
of the DNA sequence of GS in Mptb is readily obtainable by a
person skilled in the art using standard laboratory procedures.
The entire functional DNA sequence including core region and
transmissible element of GS in Mptb and Mavs as described above
comprises the polynucleotide sequences of the invention.

There are 8 open reading frames (ORFs) in GS. Six of these,
designated GSA, GSB, GSC, GSD, GSE and GSF, are encoded by the
core DNA region of GS which, characteristically for a
pathogenicity island, has a different GC content than the rest
of the microbial genome. Two ORFs designated GSG and GSH are
encoded by the transmissible element of GS whose GC content
resembles that of the rest of the mycobacterial genome. The ORF
GSH comprises two sub-ORFs, H1 and H2, on the complementary DNA strand
linked by a programmed frameshifting site so that a single
polypeptide is translated from the ORF GSH. The nucleotide
sequences of the 8 ORFs in GS and their translations are shown
in Seq.ID No 5 to Seq.ID No 29 as follows:

ORF A:  Seq. ID No 5   Nucleotides 50 to 427 of GS from Mavs
        Seq. ID No 6   Amino acid sequence encoded by Seq. ID No 5.

ORF B:  Seq. ID No 7   Nucleotides 772 to 1605 of GS from Mavs
        Seq. ID No 8   Amino acid sequence encoded by Seq. ID No 7.

ORF C:  Seq. ID No 9   Nucleotides 1814 to 2845 of GS from Mavs
        Seq. ID No 10  Amino acid sequence encoded by Seq. ID No 9.
        Seq. ID No 11  Nucleotides 201 to 1232 of GS from Mptb
        Seq. ID No 12  Amino acid sequence encoded by Seq. ID No 11.

ORF D:  Seq. ID No 13  Nucleotides 2785 to 3804 of GS from Mavs
        Seq. ID No 14  Amino acid sequence encoded by Seq. ID No 13.
        Seq. ID No 15  Nucleotides 1172 to 2191 of GS from Mptb
        Seq. ID No 16  Amino acid sequence encoded by Seq. ID No 15.

ORF E:  Seq. ID No 17  Nucleotides 4080 to 4802 of GS from Mavs
        Seq. ID No 18  Amino acid sequence encoded by Seq. ID No 17.
        Seq. ID No 19  Nucleotides 2467 to 3189 of GS from Mptb
        Seq. ID No 20  Amino acid sequence encoded by Seq. ID No 19.

ORF F:  Seq. ID No 21  Nucleotides 4947 to 5747 of GS from Mavs
        Seq. ID No 22  Amino acid sequence encoded by Seq. ID No 21.
        Seq. ID No 23  Nucleotides 3335 to 4135 of GS from Mptb
        Seq. ID No 24  Amino acid sequence encoded by Seq. ID No 23.

ORF G:  Seq. ID No 25  Nucleotides 6176 to 7042 of GS from Mavs
        Seq. ID No 26  Amino acid sequence encoded by Seq. ID No 25.

ORF H:  Seq. ID No 27  Nucleotides 7953 to 6215 from Mavs.

ORF H1: Seq. ID No 28  Amino acid sequence encoded by
                       nucleotides 7953 to 7006 of Seq. ID No 27

ORF H2: Seq. ID No 29  Amino acid sequence encoded by
                       nucleotides 7009 to 6215 of Seq. ID No 27
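
The coordinates listed above can be sanity-checked arithmetically. The sketch below is illustrative only: it assumes the coordinates are 1-based and inclusive, and the helper names are hypothetical. It computes the length of each listed span and the offset between the Mavs and Mptb numbering for the ORFs that are listed in both organisms.

```python
# Coordinates copied from the ORF listing above (assumed 1-based, inclusive).
MAVS = {"A": (50, 427), "B": (772, 1605), "C": (1814, 2845),
        "D": (2785, 3804), "E": (4080, 4802), "F": (4947, 5747),
        "G": (6176, 7042)}
MPTB = {"C": (201, 1232), "D": (1172, 2191),
        "E": (2467, 3189), "F": (3335, 4135)}
# Sub-ORFs of ORF H, listed from the higher to the lower coordinate.
H1 = (7953, 7006)
H2 = (7009, 6215)


def span_length(start, end):
    """Number of nucleotides in an inclusive span, in either orientation."""
    return abs(end - start) + 1


def numbering_offsets(mavs, mptb):
    """Difference (Mavs minus Mptb) at the start and end of each shared ORF."""
    return {name: (mavs[name][0] - mptb[name][0], mavs[name][1] - mptb[name][1])
            for name in mptb}


for name, (start, end) in MAVS.items():
    print(name, span_length(start, end), span_length(start, end) % 3 == 0)
print(numbering_offsets(MAVS, MPTB))
print(span_length(*H1), span_length(*H2))
```

On these figures every listed span for ORFs A to G is a whole number of codons, ORFs C, D and E differ by 1613 between the two numberings, and ORF F differs by 1612. The H1 and H2 spans share four positions (7006 to 7009), which is consistent with the frameshifting site described in the text.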

The polynucleotides in Mtb with homology to the ORFs B, C, E and
F of GS in Mptb and Mavs, and the polypeptides they are now known
to encode as a result of our invention, are as follows:

ORF B:  Seq. ID No 30  Cosmid MTCY277 nucleotides 35493 to 34705
        Seq. ID No 31  Amino acid sequence encoded by Seq. ID No 30.

ORF C:  Seq. ID No 32  Cosmid MTCY277 nucleotides 31972 to 32994
        Seq. ID No 33  Amino acid sequence encoded by Seq. ID No 32.

ORF E:  Seq. ID No 34  Cosmid MTCY277 nucleotides 34687 to 33956
        Seq. ID No 35  Amino acid sequence encoded by Seq. ID No 34.

ORF E:  Seq. ID No 36  Cosmid MT024 nucleotides 15934 to 15203
        Seq. ID No 37  Amino acid sequence encoded by Seq. ID No 36.

ORF F:  Seq. ID No 38  Cosmid MT024 nucleotides 15133 to 14306
        Seq. ID No 39  Amino acid sequence encoded by Seq. ID No 38.
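
A minimal sketch of how the cosmid coordinates above could be normalised and compared follows. It assumes, as an interpretation rather than something stated in the listing, that a start coordinate higher than the end coordinate denotes the complementary strand, and that the coordinates are 1-based and inclusive. The helper names are hypothetical.

```python
# (cosmid, ORF) -> (start, end) as given in the listing above.
MTB_ORFS = {
    ("MTCY277", "B"): (35493, 34705),
    ("MTCY277", "C"): (31972, 32994),
    ("MTCY277", "E"): (34687, 33956),
    ("MT024", "E"): (15934, 15203),
    ("MT024", "F"): (15133, 14306),
}


def normalise(start, end):
    """Return (low, high, strand); '-' is assumed when start > end."""
    strand = "+" if start <= end else "-"
    return min(start, end), max(start, end), strand


def gaps_on_cosmid(orfs, cosmid):
    """Nucleotides separating neighbouring ORFs on one cosmid, by position."""
    spans = sorted((normalise(*coords), name)
                   for (c, name), coords in orfs.items() if c == cosmid)
    gaps = []
    for ((_, prev_high, _), prev), ((low, _, _), cur) in zip(spans, spans[1:]):
        gaps.append((prev, cur, low - prev_high - 1))
    return gaps


print(gaps_on_cosmid(MTB_ORFS, "MTCY277"))
print(gaps_on_cosmid(MTB_ORFS, "MT024"))
```

Under these assumptions the listed ORFs lie within roughly a kilobase of one another on each cosmid, with ORFs E and B of MTCY277 separated by only 17 nucleotides.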

The proteins and peptides encoded by the ORFs A to H in Mptb and
Mavs and the amino acid sequences from homologous genes we have
discovered in Mtb given in Seq.ID Nos 31, 33, 35, 37 and 39, as
described above, and fragments thereof, comprise the polypeptides
of the invention. The polypeptides of the invention are believed
to be associated with specific immunoreactivity and with the
pathogenicity of the host micro-organisms from which they were
obtained.

The present invention thus provides a polynucleotide in
substantially isolated form which is capable of selectively
hybridising to Seq.ID Nos 3 or 4 or a fragment thereof. The
polynucleotide fragment may alternatively comprise a sequence
selected from the group of Seq.ID No: 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25 and 27. The invention further provides a
polynucleotide in substantially isolated form whose sequence
consists essentially of a sequence selected from the group Seq.ID
Nos 30, 32, 34, 36 and 38, or a corresponding sequence
selectively hybridizable thereto, or a fragment of said sequence
or corresponding sequence.

The invention further provides diagnostic probes such as a probe
which comprises a fragment of at least 15 nucleotides of a
polynucleotide of the invention, or a peptide nucleic acid or
similar synthetic sequence specific ligand, optionally carrying
a revealing label. The invention also provides a vector carrying
a polynucleotide as defined above, particularly an expression
vector.

The invention further provides a polypeptide in substantially
isolated form which comprises any one of the sequences selected
from the group consisting of Seq.ID No: 6, 8, 10, 12, 14, 16, 18,
20, 22, 24, 26, 28, 29, 31, 33, 35, 37 and 39, or a polypeptide
substantially homologous thereto. The invention additionally
provides a polypeptide fragment which comprises a fragment of a
polypeptide defined above, said fragment comprising at least 10
amino acids and an epitope. The invention also provides
polynucleotides in substantially isolated form which encode
polypeptides of the invention, and vectors which comprise such
polynucleotides, as well as antibodies capable of binding such
polypeptides. In an additional aspect, the invention provides
kits comprising polynucleotides, polypeptides, antibodies or
synthetic ligands of the invention and methods of using such kits
in diagnosing the presence or absence of mycobacteria in a
sample. The invention also provides pharmaceutical compositions
comprising polynucleotides of the invention, polypeptides of the
invention or antisense probes and the use of such compositions
in the treatment or prevention of diseases caused by
mycobacteria. The invention also provides polynucleotides [...] the
prevention and treatment of infections due to GS-containing
pathogenic mycobacteria in animals and humans and as a means of
enhancing in vivo susceptibility of said mycobacteria to
antimicrobial drugs. The invention also provides bacteria or
viruses transformed with polynucleotides of the invention for use
as vaccines. The invention further provides Mptb or Mavs in
which all or part of the polynucleotides of the invention have
been deleted or disabled to provide mutated organisms of lower
pathogenicity for use as vaccines in animals and humans. The
invention further provides Mtb in which all or part of the
polynucleotides encoding polypeptides of the invention have been
deleted or disabled to provide mutated organisms of lower
pathogenicity for use as vaccines in animals and humans.



A further aspect of the invention is our discovery of homologies
between the ORFs B, C and E in GS on the one hand, and Mtb cosmid
MTCY277 on the other (data from Genbank database using the
computer programmes BLAST and BLIXEM). The homologous ORFs in
MTCY277 are adjacent to one another consistent with the form of
another pathogenicity island in Mtb. A further aspect of the
invention is our discovery of homologies between ORFs E and F in
GS, and Mtb cosmid MT024 (also Genbank, as above) with the
homologous ORFs close to one another. The use of polynucleotides
and polypeptides from Mtb (Seq.ID Nos 30, 31, 32, 33, 34, 35, 36,
37, 38 and 39) in substantially isolated form as diagnostics,
vaccines and targets for chemotherapy, for the management and
prevention of Mtb infections in humans and animals, and the
processes involved in the preparation and use of these
diagnostics, vaccines and new chemotherapeutic agents, comprise
further aspects of the invention.
A. Polynucleotides



Polynucleotides of the invention as defined herein may comprise
DNA or RNA. They may also be polynucleotides which include
within them synthetic or modified nucleotides or peptide nucleic
acids. A number of different types of modification to
oligonucleotides are known in the art. These include
methylphosphonate and phosphorothioate backbones, and addition of
acridine or polylysine chains at the 3' and/or 5' ends of the
molecule. For the purposes of the present invention, it is to
be understood that the polynucleotides described herein may be
modified by any method available in the art. Such modifications
may be carried out in order to couple the said polynucleotide to
a solid phase or to enhance the recognition, the in vivo
activity, or the lifespan of polynucleotides of the invention.

A number of different types of polynucleotides of the invention
are envisaged. In the broadest aspect, polynucleotides and
fragments thereof capable of hybridizing to SEQ ID NO: 3 or 4 form
a first aspect of the invention. This includes the
polynucleotide of SEQ ID NO: 3 or 4. Within this class of
polynucleotides various sub-classes of polynucleotides are of
particular interest.

One sub-class of polynucleotides which is of interest is the
class of polynucleotides encoding the open reading frames A, B,
C, D, E, F, G and H, including SEQ ID NOs: 5, 7, 9, 11, 13, 15,
17, 19, 21, 23, 25 and 27. As discussed below, polynucleotides
encoding ORF H include the polynucleotide sequences 7953 to 7006
and 7009 to 6215 within SEQ ID NO: 27, as well as modified
sequences in which the frame-shift has been modified so that the
two sub-reading frames are placed in a single reading frame.
This may be desirable where the polypeptide is to be produced in
recombinant expression systems.

The invention thus provides a polynucleotide in substantially
isolated form which encodes any one of these ORFs or combinations
thereof. Combinations thereof includes combinations of 2, 3, 4,
5 or all of the ORFs. Polynucleotides may be provided which
comprise an individual ORF carried in a recombinant vector
including the vectors described herein. Thus in one preferred
aspect the invention provides a polynucleotide in substantially
isolated form capable of selectively hybridizing to the nucleic
acid comprising ORFs A to F of the core region of the Mptb and
Mavs pathogenicity islands of the invention. Fragments thereof
corresponding to ORFs A to E, B to F, A to D, B to E, A to C, B
to D or any two adjacent ORFs are also included in the invention.

Polynucleotides of the invention will be capable of selectively
hybridizing to the corresponding portion of the GS region, or to
the corresponding ORFs of Mtb described herein. The term
"selectively hybridizing" indicates that the polynucleotides will
hybridize, under conditions of medium to high stringency (for
example 0.03 M sodium chloride and 0.03 M sodium citrate at from
about 50°C to about 60°C) to the corresponding portion of SEQ ID
NO: 3 or 4 or the complementary strands thereof but not to genomic
DNA from mycobacteria which are usually non-pathogenic including
non-pathogenic species of M. avium. Such polynucleotides will
generally be at least 68%, e.g. at least 70%,
preferably at least 80 or 90% and more preferably at least 95%
homologous to the corresponding DNA of GS. The corresponding
portion will be over a region of at least 20, preferably at
least 30, for instance at least 40, 60 or 100 or more contiguous
nucleotides.

By "corresponding portion" it is meant a sequence from the GS
region of the same or substantially similar size which has been
determined, for example by computer alignment, to have the
greatest degree of homology to the polynucleotide.

Any combination of the above mentioned degrees of homology and
minimum sizes may be used to define polynucleotides of the
invention, with the more stringent combinations (i.e. higher
homology over longer lengths) being preferred. Thus for example
a polynucleotide which is at least 80% homologous over 25,
preferably 30 nucleotides forms one aspect of the invention, as
does a polynucleotide which is at least 90% homologous over 40
nucleotides.
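
The combination of a homology threshold and a minimum length can be illustrated with a minimal sketch. It assumes that "homologous" is read as percent identity over an ungapped, already aligned stretch, which is a simplification of the computer alignment mentioned above. The function names and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(a, b):
    """Percentage of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(query, reference, min_identity, min_length):
    """True if any window of min_length contiguous aligned positions
    reaches min_identity percent identity (ungapped comparison only)."""
    if len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    for i in range(len(query) - min_length + 1):
        window_q = query[i:i + min_length]
        window_r = reference[i:i + min_length]
        if percent_identity(window_q, window_r) >= min_identity:
            return True
    return False


# Toy 40-nucleotide reference and a query with four mismatches (made-up data).
reference = "ACGT" * 10
query = list(reference)
for pos in (0, 10, 20, 30):
    query[pos] = "T"  # the reference base at each of these positions is A or G
query = "".join(query)

print(percent_identity(query, reference))
print(meets_threshold(query, reference, min_identity=90, min_length=40))
print(meets_threshold(query, reference, min_identity=95, min_length=40))
```

A real comparison would use a gapped alignment tool; the sketch only shows how the two parameters interact, with higher identity over longer windows being the more stringent test.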



A further class of polynucleotides of the invention is the class
of polynucleotides encoding polypeptides of the invention, the
polypeptides of the invention being defined in section B below.
Due to the redundancy of the genetic code as such,
polynucleotides may be of a lower degree of homology than
required for selective hybridization to the GS region. However,
when such polynucleotides encode polypeptides of the invention
these polynucleotides form a further aspect. It may for example
be desirable where polypeptides of the invention are produced
recombinantly to increase the GC content of such polynucleotides.
This increase in GC content may result in higher levels of
expression via codon usage more appropriate to the host cell in
which recombinant expression is taking place.

An additional class of polynucleotides of the invention are those
obtainable from cosmids MTCY277 and MT024 (containing Mtb genomic
sequences), which polynucleotides consist essentially of the
fragment of the cosmid containing an open reading frame encoding
any one of the homologous ORFs B, C, E or F respectively. Such
polynucleotides are referred to below as Mtb polynucleotides.
However, where reference is made to polynucleotides in general
such reference includes Mtb polynucleotides unless the context
is explicitly to the contrary. In addition, the invention
provides polynucleotides which encode the same polypeptide as the
abovementioned ORFs of Mtb but which, due to the redundancy of
the genetic code, have different nucleotide sequences. These
form further Mtb polynucleotides of the invention. Fragments of
Mtb polynucleotides suitable for use as probes or primers also
form a further aspect of the invention.

The invention further provides polynucleotides in substantially 
isolated form capable of selectively hybridizing (where 
selectively hybridizing is as defined above) to the Mtb 
polynucleotides of the invention. 
The invention further provides the Mtb polynucleotides of the
invention linked, at either the 5' and/or 3' end to
polynucleotide sequences to which they are not naturally
contiguous. Such sequences will typically be sequences found in
cloning or expression vectors, such as promoters, 5' untranslated
sequence, 3' untranslated sequence or termination sequences. The
sequences may also include further coding sequences such as
signal sequences used in recombinant production of proteins.

Further polynucleotides of the invention are illustrated in the
accompanying examples.

Polynucleotides of the invention may be used to produce a primer, 
e.g. a PCR primer, a primer for an alternative amplification 
reaction, a probe e.g. labelled with a revealing label by 
conventional means using radioactive or non-radioactive labels 
or a probe linked covalently to a solid phase, or the 
polynucleotides may be cloned into vectors. Such primers, 
probes and other fragments will be at least 15, preferably at 
least 20, for example at least 25, 30 or 40 or more nucleotides 
in length, and are also encompassed by the term polynucleotides 
of the invention as used herein. 

Primers of the invention which are preferred include primers 
directed to any part of the ORFs defined herein. The ORFs from 
other isolates of pathogenic mycobacteria which contain a GS 
region may be determined and conserved regions within each 
individual ORF may be identified. Primers directed to such 
conserved regions form a further preferred aspect of the 
invention. In addition, the primers and other polynucleotides 
of the invention may be used to identify, obtain and isolate ORFs 
capable of selectively hybridizing to the polynucleotides of the 
invention which are present in pathogenic mycobacteria but which 
are not part of a pathogenicity island in that particular species 
of bacteria. Thus in addition to the ORFs B, C, E and F which 
have been identified in Mtb, similar ORFs may be identified in 
other pathogens and ORFs corresponding to the GS ORFs C, D, E, 
F and H, may also be identified. 

Polynucleotides such as DNA polynucleotides and probes according 
to the invention may be produced recombinantly, synthetically, 
or by any means available to those of skill in the art. They may 
also be cloned by standard techniques. 

In general, primers will be produced by synthetic means, 
involving a step-wise manufacture of the desired nucleic acid 
sequence one nucleotide at a time. Automated techniques for 
accomplishing this are readily available in the art. 
Longer polynucleotides will generally be produced using 
recombinant means, for example using a PCR (polymerase chain 
reaction) cloning technique. This will involve making a pair 
of primers (e.g. of about 15-30 nucleotides) to a region of GS 
which it is desired to clone, bringing the primers into contact 
with genomic DNA from a mycobacterium or a vector carrying the 
GS sequence, performing a polymerase chain reaction under 
conditions which bring about amplification of the desired region, 
isolating the amplified fragment (e.g. by purifying the reaction 
mixture on an agarose gel) and recovering the amplified DNA. The 
primers may be designed to contain suitable restriction enzyme 
recognition sites so that the amplified DNA can be cloned into 
a suitable cloning vector. 
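The primer design step described above can be sketched in Python as follows. This is a minimal illustration only: the function names, the default primer length of 20 nucleotides and the choice of EcoRI (GAATTC) and BamHI (GGATCC) recognition sites are assumptions made for the example and are not taken from the specification. In practice a few extra bases are often added 5' of each site, and melting temperature and specificity would also be checked.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(template, start, end, length=20,
                   fwd_site="GAATTC", rev_site="GGATCC"):
    """Build a primer pair flanking template[start:end].

    The forward primer copies the first `length` bases of the region;
    the reverse primer is the reverse complement of the last `length`
    bases. A restriction site is added to the 5' end of each primer so
    that the amplified fragment can be cut and ligated into a vector.
    """
    region = template[start:end].upper()
    if len(region) < 2 * length:
        raise ValueError("region is too short for two separate primers")
    forward = fwd_site + region[:length]
    reverse = rev_site + reverse_complement(region[-length:])
    return forward, reverse


if __name__ == "__main__":
    toy_template = "ATGC" * 20  # hypothetical sequence, not a GS sequence
    fwd, rev = design_primers(toy_template, 0, 80, length=15)
    print(fwd)
    print(rev)
```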

Such techniques may be used to obtain all or part of the GS or 
ORF sequences described herein, as well as further genomic clones 
containing full open reading frames. Although in general such 
techniques are well known in the art, reference may be made in 
particular to Sambrook J., Fritsch E.F., Maniatis T. (1989). 
Molecular Cloning: A Laboratory Manual, 2nd edn. Cold Spring 
Harbor, New York, Cold Spring Harbor Laboratory. 

Polynucleotides which are not 100% homologous to the sequences 
of the present invention but fall within the scope of the 
invention can be obtained in a number of ways. 

Other isolates or strains of pathogenic mycobacteria will be 
expected to contain allelic variants of the GS sequences 
described herein, and these may be obtained for example by 
probing genomic DNA libraries made from such isolates or strains 
of bacteria using GS or ORF sequences as probes under conditions 
of medium to high stringency (for example 0.03M sodium chloride 
and 0.03M sodium citrate at from about 50°C to about 60°C). 

A particularly preferred group of pathogenic mycobacteria are 
isolates of M. paratuberculosis. Polynucleotides based on GS 
regions from such bacteria are particularly preferred. Preferred 
fragments of such regions include fragments encoding individual 
open reading frames including the preferred groups and 
combinations of open reading frames discussed above. 

Alternatively, such polynucleotides may be obtained by site 
directed mutagenesis of the GS or ORF sequences or allelic 
variants thereof. This may be useful where for example silent 
codon changes are required to sequences to optimise codon 
preferences for a particular host cell in which the 
polynucleotide sequences are being expressed. Other sequence 
changes may be desired in order to introduce restriction enzyme 
recognition sites, or to alter the property or function of the 
polypeptides encoded by the polynucleotides of the invention. 
Such altered property or function will include the addition of 
amino acid sequences of consensus signal peptides known in the 
art to effect transport and secretion of the modified polypeptide 
of the invention. Another altered property will include 
mutagenesis of a catalytic residue or generation of fusion 
proteins with another polypeptide. Such fusion proteins may be 
with an enzyme, with an antibody or with a cytokine or other 
ligand for a receptor, to target a polypeptide of the invention 
to a specific cell type in vitro or in vivo. 

The invention further provides double stranded polynucleotides 
comprising a polynucleotide of the invention and its complement. 

Polynucleotides or primers of the invention may carry a revealing 
label. Suitable labels include radioisotopes such as ³²P or ³⁵S, 
enzyme labels, other protein labels or smaller labels such as 
biotin or fluorophores. Such labels may be added to 
polynucleotides or primers of the invention and may be detected 
using techniques known per se. 

Polynucleotides or primers of the invention or fragments thereof 
labelled or unlabelled may be used by a person skilled in the art 
in nucleic acid-based tests for the presence or absence of Mptb, 
Mavs, other GS-containing pathogenic mycobacteria, or Mtb applied 
to samples of body fluids, tissues, or excreta from animals and 
humans, as well as to food and environmental samples such as 
river or ground water and domestic water supplies. 

Human and animal body fluids include sputum, blood, serum, 
plasma, saliva, milk, urine, CSF, semen, faeces and infected 
discharges. Tissues include intestine, mouth ulcers, skin, lymph 
nodes, spleen, lung and liver obtained surgically or by a biopsy 
technique. Animals particularly include commercial livestock 
such as cattle, sheep, goats, deer and rabbits, but wild animals 
and animals in zoos may also be tested. 

Such tests comprise bringing a human or animal body fluid or 
tissue extract, or an extract of an environmental or food sample, 
into contact with a probe comprising a polynucleotide or primer 
of the invention under hybridising conditions and detecting any 
duplex formed between the probe and nucleic acid in the sample. 
Such detection may be achieved using techniques such as PCR or 
by immobilising the probe on a solid support, removing nucleic 
acid in the sample which is not hybridized to the probe, and then 
detecting nucleic acid which has hybridized to the probe. 
Alternatively, the sample nucleic acid may be immobilized on a 
solid support, and the amount of probe bound to such a support 
can be detected. Suitable assay methods of this and other 
formats can be found in for example WO89/03891 and WO90/13667. 

Polynucleotides of the invention or fragments thereof labelled 
or unlabelled may also be used to identify and characterise 
different strains of Mptb, Mavs, other GS-containing pathogenic 
mycobacteria, or Mtb, and properties such as drug resistance or 
susceptibility. 

The probes of the invention may conveniently be packaged in the 
form of a test kit in a suitable container. In such kits the 
probe may be bound to a solid support where the assay format for 
which the kit is designed requires such binding. The kit may 
also contain suitable reagents for treating the sample to be 
probed, hybridising the probe to nucleic acid in the sample, 
control reagents, instructions, and the like. 

The use of polynucleotides of the invention in the diagnosis of 
inflammatory diseases such as Crohn's disease or sarcoidosis in 
humans or Johne's disease in animals forms a preferred aspect of 
the invention. The polynucleotides may also be used in the 
prognosis of these diseases. For example, the response of a 
human or animal subject to antibiotic, vaccination 
or other therapies may be monitored by utilizing the diagnostic 
methods of the invention over the course of a period of treatment 
and following such treatment. 

The use of Mtb polynucleotides (particularly in the form of 
probes and primers) of the invention in the above-described 
methods forms a further aspect of the invention, particularly for 
the detection, diagnosis or prognosis of Mtb infections. 

B. Polypeptides. 

Polypeptides of the invention include polypeptides in 
substantially isolated form encoded by GS. This includes the 
full length polypeptides encoded by the positive and 
complementary negative strands of GS. Each of the full length 
polypeptides will contain one of the amino acid sequences set out 
in Seq ID NOs:6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 
29. Polypeptides of the invention further include variants of 
such sequences, including naturally occurring allelic variants 
and synthetic variants which are substantially homologous to said 
polypeptides. In this context, substantial homology is regarded 
as a sequence which has at least 70%, e.g. 80%, 90%, 95% or 98% 
amino acid homology (identity) over 30 or more, e.g. 40, 50 or 100 
amino acids. For example, one group of substantially homologous 
polypeptides are those which have at least 95% amino acid 
identity to a polypeptide of any one of Seq ID NOs:6, 8, 10, 12, 
14, 16, 18, 20, 22, 24, 26, 28 and 29 over their entire length. 
Even more preferably, this homology is 98%. 
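As a rough illustration of the identity thresholds given above, the following Python sketch computes percent amino acid identity between two sequences that are assumed to be already aligned without gaps, and applies the 70% over 30 or more residues criterion. The function names are hypothetical, and treating the sequences as pre-aligned is a simplifying assumption; real comparisons would normally use an alignment program that handles insertions and deletions.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two equal-length,
    pre-aligned (ungapped) amino acid sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def is_substantially_homologous(seq_a, seq_b, threshold=70.0, min_length=30):
    """Apply the 'at least threshold % identity over min_length or more
    amino acids' test to two pre-aligned sequences."""
    return (len(seq_a) >= min_length
            and percent_identity(seq_a, seq_b) >= threshold)


if __name__ == "__main__":
    reference = "ACDEFGHIKLMNPQRSTVWY" * 2   # hypothetical 40-residue sequence
    variant = "WWWW" + reference[4:]         # four substituted residues
    print(percent_identity(reference, variant))
    print(is_substantially_homologous(reference, variant))
    print(is_substantially_homologous(reference, variant, threshold=95.0))
```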

Polypeptides of the invention further include the polypeptide 
sequences of the homologous ORFs of Mtb, namely Seq ID Nos. 31, 
33, 35, 37 and 39. Unless explicitly specified to the contrary, 
reference to polypeptides of the invention and their fragments 
includes these Mtb polypeptides and fragments, and variants 
thereof (substantially homologous to said sequences) as defined 
herein. 

Polypeptides of the invention may be obtained by the standard 
techniques mentioned above. Polypeptides of the invention also 
include fragments of the above mentioned full length polypeptides 
and variants thereof, including fragments of the sequences set 
out in SEQ ID NOs:6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 
29, 31, 33, 35, 37 and 39. Such fragments for example of 8, 10, 
12, 15 or up to 30 or 40 amino acids may also be obtained 
synthetically using standard techniques known in the art. 

Preferred fragments include those which include an epitope, 
especially an epitope which is specific to the pathogenicity of 
the mycobacterial cell from which the polypeptide is derived. 
Suitable fragments will be at least about 5, e.g. 8, 10, 12, 15 
or 20 amino acids in size, or larger. Epitopes may be determined 
by techniques such as peptide scanning techniques as 
described by Geysen et al, Mol. Immunol., 23; 709-715 (1986), as 
well as by other techniques known in the art. 
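Peptide scanning typically starts from a set of short overlapping peptides covering the polypeptide of interest. The Python sketch below generates such a set; it is an illustration only, the function name is hypothetical, and the default window of 8 residues offset by 1 is an assumption chosen from the fragment sizes mentioned above rather than a prescribed protocol.

```python
def overlapping_peptides(protein, length=8, step=1):
    """Return every peptide of `length` residues along `protein`,
    advancing `step` residues between successive peptides."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    protein = protein.upper()
    return [protein[i:i + length]
            for i in range(0, len(protein) - length + 1, step)]


if __name__ == "__main__":
    toy_protein = "ACDEFGHIKL"  # hypothetical sequence for illustration
    for peptide in overlapping_peptides(toy_protein):
        print(peptide)
```

Each peptide in the returned list would then be synthesised and tested for antibody binding, and the reactive peptides indicate the approximate position of a linear epitope.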

The term "an epitope which is specific to the pathogenicity of 
the mycobacterial cell" means that the epitope is encoded by a 
portion of the GS region, or by the corresponding ORF sequences 
of Mtb which can be used to distinguish mycobacteria which are 
pathogenic from related non-pathogenic mycobacteria including 
non-pathogenic species of M. avium. This may be determined using 
routine methodology. A candidate epitope from an ORF may be 
prepared and used to immunise an animal such as a rat or rabbit 
in order to generate antibodies. The antibodies may then be used 
to detect the presence of the epitope in pathogenic mycobacteria 
and to confirm that non-pathogenic mycobacteria do not contain 
any proteins which react with the epitope. Epitopes may be 
linear or conformational. 

Polypeptides of the invention may be in a substantially isolated 
form. It will be understood that the polypeptide may be mixed 
with carriers or diluents which will not interfere with the 
intended purpose of the polypeptide and still be regarded as 
substantially isolated. A polypeptide of the invention may also 
be in a substantially purified form, in which case it will 
generally comprise the polypeptide in a preparation in which more 
than 90%, e.g. 95%, 98% or 99% of the polypeptide in the 
preparation is a polypeptide of the invention. 

Polypeptides of the invention may be modified to confer a desired 
property or function for example by the addition of Histidine 
residues to assist their purification or by the addition of a 
signal sequence to promote their secretion from a cell. 

Thus, polypeptides of the invention include fusion proteins which 
comprise a polypeptide encoded by all or part of one or more of 
the ORFs of the invention fused at the N- or C-terminus to a second 
sequence to provide the desired property or function. Sequences 
which promote secretion from a cell include, for example the 
yeast α-factor signal sequence. 

A polypeptide of the invention may be labelled with a revealing 
label. The revealing label may be any suitable label which 
allows the polypeptide to be detected. Suitable labels include 
radioisotopes, e.g. ¹²⁵I, ³⁵S, enzymes, antibodies, polynucleotides 
and ligands such as biotin. Labelled polypeptides of the 
invention may be used in diagnostic procedures such as 
immunoassays in order to determine the amount of a polypeptide 
of the invention in a sample. Polypeptides or labelled 
polypeptides of the invention may also be used in serological or 
cell mediated immune assays for the detection of immune 
reactivity to said polypeptides in animals and humans using 
standard protocols. 

A polypeptide or labelled polypeptide of the invention or 
fragment thereof may also be fixed to a solid phase, for example 
the surface of an immunoassay well, microparticle, dipstick or 
biosensor. Such labelled and/or immobilized polypeptides may be 
packaged into kits in a suitable container along with suitable 
reagents, controls, instructions and the like. 

Such polypeptides and kits may be used in methods of detection 
of antibodies or cell mediated immunoreactivity to the 
mycobacterial proteins and peptides encoded by the ORFs of the 
invention and their allelic variants and fragments, using 
immunoassay. Such host antibodies or cell mediated immune 
reactivity will occur in humans or animals with an immune system 
which detects and reacts against polypeptides of the invention. 
The antibodies may be present in a biological sample from such 
humans or animals, where the biological sample may be a sample 
as defined above particularly blood, milk or saliva. 

Immunoassay methods are well known in the art and will generally 
comprise: 

(a) providing a polypeptide of the invention comprising an 
epitope bindable by an antibody against said 
mycobacterial polypeptide; 

(b) incubating a biological sample with said polypeptide 
under conditions which allow for the formation of an 
antibody-antigen complex; and 

(c) determining whether antibody-antigen complex 
comprising said polypeptide is formed. 

Immunoassay methods for cell mediated immune reactivity in 
animals and humans are also well known in the art (e.g. as 
described by Weir et al 1994, J. Immunol Methods 176; 93-101) and 
will generally comprise: 

(a) providing a polypeptide of the invention comprising an 
epitope bindable by a lymphocyte or macrophage or 
other cell receptor; 

(b) incubating a cell sample with said polypeptide under 
conditions which allow for a cellular immune response 
such as release of cytokines or other mediator to 
occur; and 

(c) detecting the presence of said cytokine or mediator in 
the incubate. 

Polypeptides of the invention may be made by standard synthetic 
means well known in the art or recombinantly, as described below. 

Polypeptides of the invention or fragments thereof labelled or 
unlabelled may also be used to identify and characterise 
different strains of Mptb, Mavs, other GS-containing pathogenic 
mycobacteria, or Mtb, and properties such as drug resistance or 
susceptibility. 

The polypeptides of the invention may conveniently be packaged 
in the form of a test kit in a suitable container. In such kits 
the polypeptide may be bound to a solid support where the assay 
format for which the kit is designed requires such binding. The 
kit may also contain suitable reagents for treating the sample 
to be examined, control reagents, instructions, and the like. 

The use of polypeptides of the invention in the diagnosis of 
inflammatory diseases such as Crohn's disease or sarcoidosis in 
humans or Johne's disease in animals forms a preferred aspect of 
the invention. The polypeptides may also be used in the 
prognosis of these diseases. For example, the response of a 
human or animal subject to antibiotic or other 
therapies may be monitored by utilizing the diagnostic methods 
of the invention over the course of a period of treatment and 
following such treatment. 

The use of Mtb polypeptides of the invention in the 
above-described methods forms a further aspect of the invention, 
particularly for the detection, diagnosis or prognosis of Mtb 
infections. 

Polypeptides of the invention may also be used in assay methods 
for identifying candidate chemical compounds which will be useful 
in inhibiting, binding to or disrupting the function of said 
polypeptides required for pathogenicity. In general, such assays 
involve bringing the polypeptide into contact with a candidate 
inhibitor compound and observing the ability of the compound to 
disrupt, bind to or interfere with the polypeptide. 

There are a number of ways in which the assay may be formatted. 
For example, those polypeptides which have an enzymatic function 
may be assayed using labelled substrates for the enzyme, and the 
amount of, or rate of, conversion of the substrate into a product 
measured, e.g. by chromatography such as HPLC or by a colourimetric 
assay. Suitable labels include ³⁵S, ¹²⁵I, biotin or enzymes such 
as horseradish peroxidase. 
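Where the rate of conversion of substrate into product is measured with and without a candidate compound, the results can be reduced to a percent inhibition. The Python sketch below is one simple way of doing so, assuming product formation is linear over the sampled time points; the function names and the example readings are hypothetical and are not data from the specification.

```python
def conversion_rate(times, product):
    """Least-squares slope of product formed against time, i.e. the
    initial rate under the assumption of linear product formation."""
    if len(times) != len(product) or len(times) < 2:
        raise ValueError("need at least two paired readings")
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(product) / n
    numerator = sum((t - mean_t) * (p - mean_p)
                    for t, p in zip(times, product))
    denominator = sum((t - mean_t) ** 2 for t in times)
    if denominator == 0:
        raise ValueError("time points must not all be identical")
    return numerator / denominator


def percent_inhibition(rate_control, rate_with_compound):
    """Reduction in rate caused by the candidate compound, as a percentage
    of the uninhibited (control) rate."""
    if rate_control == 0:
        raise ValueError("control rate must be non-zero")
    return 100.0 * (1.0 - rate_with_compound / rate_control)


if __name__ == "__main__":
    minutes = [0, 1, 2, 3]                 # hypothetical sampling times
    control = [0.0, 2.0, 4.0, 6.0]         # hypothetical product readings
    treated = [0.0, 0.5, 1.0, 1.5]
    print(percent_inhibition(conversion_rate(minutes, control),
                             conversion_rate(minutes, treated)))
```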

For example, the gene product of ORF C is believed to have 
GDP-mannose dehydratase activity. Thus an assay for inhibitors of the 
gene product may utilise for example labelled GDP-mannose, GDP 
or mannose and the activity of the gene product followed. ORF 
D encodes a gene related to the synthesis and regulation of 
capsular polysaccharides, which are often associated with 
invasiveness and pathogenicity. Labelled polysaccharide 
substrates may be used in assays of the ORF D gene product. The 
gene product of ORF F is a protein with putative glucosyl 
transferase activity and thus labelled amino sugars such as 
β-1-3-N-acetylglucosamine may be used as substrates in assays. 

Candidate chemical compounds which may be used may be natural or 
synthetic chemical compounds used in drug screening programmes. 
Extracts of plants which contain several characterised or 
uncharacterised components may also be used. 

Alternatively, a polypeptide of the invention may be screened 
against a panel of peptides, nucleic acids or other chemical 
functionalities which are generated by combinatorial chemistry. 
This will allow the definition of chemical entities which bind 
to polypeptides of the invention. Typically, the polypeptide of 
the invention will be brought into contact with a panel of 
compounds from a combinatorial library, with either the panel 
or the polypeptide being immobilized on a solid phase, under 
conditions suitable for the polypeptide to bind to the panel. 
The solid phase will then be washed under conditions in which 
only specific interactions between the polypeptide and individual 
members of the panel are retained, and those specific members may 
be utilized in further assays or used to design further panels 
of candidate compounds. 

For example, a number of assay methods to define peptide 
interaction with peptides are known. For example, WO86/00991 
describes a method for determining mimotopes which comprises 
making panels of catamer preparations, for example octamers of 
amino acids, at which one or more of the positions is defined and 
the remaining positions are randomly made up of other amino 
acids, determining which catamer binds to a protein of interest 
and re-screening the protein of interest against a further panel 
based on the most reactive catamer in which one or more 
additional designated positions are systematically varied. This 
may be repeated throughout a number of cycles and used to build 
up a sequence of a binding candidate compound of interest. 
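The cyclic screening just described can be pictured as a greedy search in which one further position is fixed per cycle. The Python sketch below is a loose, simplified model of that idea and is not the method of WO86/00991 itself: the function name is hypothetical, the letter X stands in for a position left as a random mixture, and the `score` callable is an assumed stand-in for a measured binding signal.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
UNDEFINED = "X"  # placeholder for a position made up of mixed residues


def refine_mimotope(score, length=8):
    """Greedy positional refinement of a peptide of `length` residues.

    In each cycle a panel is built in which one additional position of the
    current best peptide is defined with each of the 20 amino acids. The
    panel member with the highest score (standing in for the most reactive
    catamer) becomes the starting point for the next cycle.
    """
    best = UNDEFINED * length
    for _ in range(length):
        panel = []
        for position, residue in enumerate(best):
            if residue != UNDEFINED:
                continue  # position already defined in an earlier cycle
            for amino_acid in AMINO_ACIDS:
                panel.append(best[:position] + amino_acid
                             + best[position + 1:])
        best = max(panel, key=score)
    return best


if __name__ == "__main__":
    # Hypothetical scoring function: counts residues matching a hidden
    # target, used here only so that the sketch can be run end to end.
    hidden_target = "ACDEFGHI"

    def toy_score(peptide):
        return sum(a == b for a, b in zip(peptide, hidden_target))

    print(refine_mimotope(toy_score))
```

In a laboratory setting the score would come from an assay reading for each panel member, and several positions might be varied per cycle rather than one.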

WO89/03430 describes screening methods which permit the 
preparation of specific mimotopes which mimic the immunological 
activity of a desired analyte. These mimotopes are identified 
by reacting a panel of individual peptides wherein said peptides 
are of systematically varying hydrophobicity, amphipathic 
characteristics and charge patterns, using an antibody against 
an antigen of interest. Thus in the present case antibodies 
against a polypeptide of the invention may be employed and 
mimotope peptides from such panels may be identified. 

C. Vectors. 

Polynucleotides of the invention can be incorporated into a 
recombinant replicable vector. The vector may be used to 
replicate the nucleic acid in a compatible host cell. Thus in 
a further embodiment, the invention provides a method of making 
polynucleotides of the invention by introducing a polynucleotide 
of the invention into a replicable vector, introducing the vector 
into a compatible host cell, and growing the host cell under 
conditions which bring about replication of the vector. The 
vector may be recovered from the host cell. Suitable host cells 
are described below in connection with expression vectors. 



D. Expression Vectors. 

Preferably, a polynucleotide of the invention in a vector is 
operably linked to a control sequence which is capable of 
providing for the expression of the coding sequence by the host 
cell, i.e. the vector is an expression vector. The term "operably 
linked" refers to a juxtaposition wherein the components 
described are in a relationship permitting them to function in 
their intended manner. A control sequence "operably linked" to 
a coding sequence is ligated in such a way that expression of the 
coding sequence is achieved under conditions compatible with the 
control sequences. Such vectors may be transformed into a 
suitable host cell as described above to provide for expression 
of a polypeptide of the invention. Thus, in a further aspect the 
invention provides a process for preparing polypeptides according 
to the invention which comprises cultivating a host cell 
transformed or transfected with an expression vector as described 
above, under conditions to provide for expression by the vector 
of a coding sequence encoding the polypeptides, and recovering 
the expressed polypeptides. 

A further embodiment of the invention provides vectors for the 
replication and expression of polynucleotides of the invention, 
or fragments thereof. The vectors may be for example, plasmid, 
virus or phage vectors provided with an origin of replication, 
optionally a promoter for the expression of the said 
polynucleotide and optionally a regulator of the promoter. The 
vectors may contain one or more selectable marker genes, for 
example an ampicillin resistance gene in the case of a bacterial 
plasmid or a neomycin resistance gene for a mammalian vector. 
Vectors may be used in vitro, for example for the production of 
RNA or used to transfect or transform a host cell. The vector 
may also be adapted to be used in vivo, for example in a method 
of naked DNA vaccination or gene therapy. A further embodiment 
of the invention provides host cells transformed or transfected 
with the vectors for the replication and expression of 
polynucleotides of the invention, including the DNA of GS, the 
open reading frames thereof and other corresponding ORFs 
particularly ORFs B, C, E and F from Mtb. The cells will be 
chosen to be compatible with the said vector and may for example 
be bacterial, yeast, insect or mammalian. 

Expression vectors are widely available in the art and can be 
obtained commercially. Mammalian expression vectors may comprise 
a mammalian or viral promoter. Mammalian promoters include the 
metallothionein promoter. Viral promoters include promoters from 
adenovirus, the SV40 large T promoter and retroviral LTR 
promoters. Promoters compatible with insect cells include the 
polyhedrin promoter. Yeast promoters include the alcohol 
dehydrogenase promoter. Bacterial promoters include the 
β-galactosidase promoter. 

The expression vectors may also comprise enhancers, and in the 
case of eukaryotic vectors a polyadenylation signal sequence 
downstream of the coding sequence being expressed. 

Polypeptides of the invention may be expressed in suitable host 
cells, for example bacterial, yeast, plant, insect and mammalian 
cells, and recovered using standard purification techniques 
including, for example affinity chromatography, HPLC or other 
chromatographic separation techniques. 

Polynucleotides according to the invention may also be inserted 
into the vectors described above in an antisense orientation in 
order to provide for the production of antisense RNA. Antisense 
RNA or other antisense polynucleotides or ligands may also be 
produced by synthetic means. Such antisense polynucleotides may 
be used in a method of controlling the levels of the proteins 
encoded by the ORFs of the invention in a mycobacterial cell. 

Polynucleotides of the invention may also be carried by vectors 
suitable for gene therapy methods. Such gene therapy methods 
include those designed to provide vaccination against diseases 
caused by pathogenic mycobacteria or to boost the immune response 
of a human or animal infected with a pathogenic mycobacterium. 

For example, Ziegner et al, AIDS, 1995, 9; 43-50 describes the use 
of a replication defective recombinant amphotropic retrovirus to 
boost the immune response in patients with HIV infection. Such 
a retrovirus may be modified to carry a polynucleotide encoding 
a polypeptide or fragment thereof of the invention and the 
retrovirus delivered to the cells of a human or animal subject 
in order to provide an immune response against said polypeptide. 
The retrovirus may be delivered directly to the patient or may 
be used to infect cells ex vivo, e.g. fibroblast cells, which 
are then introduced into the patient, optionally after being 
inactivated. The cells are desirably autologous or HLA-matched 
cells from the human or animal subject. 

Gene therapy methods including methods for boosting an immune 
response to a particular pathogen are disclosed generally in for 
example WO95/14091, the disclosure of which is incorporated herein 
by reference. Recombinant viral vectors include retroviral 
vectors, adenoviral vectors, adeno-associated viral vectors, 
vaccinia virus vectors, herpes virus vectors and alphavirus 
vectors. Alphavirus vectors are described in, for example, 
WO95/07994, the disclosure of which is incorporated herein by 
reference. 

Where direct administration of the recombinant viral vector is 
contemplated, either in the form of naked nucleic acid or in the 
form of packaged particles carrying the nucleic acid, this may be 
done by any suitable means, for example oral administration or 
intravenous injection. From 10⁵ to 10⁹ c.f.u. of virus represents 
a typical dose, which may be repeated for example weekly over a 
period of a few months. Administration of autologous or 
HLA-matched cells infected with the virus may be more convenient in 
some cases. This will generally be achieved by administering 
doses, for example from 10⁵ to 10⁸ cells per dose which may be 
repeated as described above. 

The recombinant viral vector may further comprise nucleic acid 
capable of expressing an accessory molecule of the immune system 
designed to increase the immune response. Such a molecule may 
be for example an interferon, particularly interferon gamma, an 
interleukin, for example IL-1α, IL-10 or IL-2, or an HLA class 
I or II molecule. This may be particularly desirable where the 
vector is intended for use in the treatment of humans or animals 
already infected with a mycobacterium and it is desired to boost 
the immune response. 

E. Antibodies. 

The invention also provides monoclonal or polyclonal antibodies 
to polypeptides of the invention or fragments thereof. The 
invention further provides a process for the production of 
monoclonal or polyclonal antibodies to polypeptides of the 
invention. Monoclonal antibodies may be prepared by conventional 
hybridoma technology using the polypeptides of the invention or 
peptide fragments thereof, as immunogens. Polyclonal antibodies 
may also be prepared by conventional means which comprise 
inoculating a host animal, for example a rat or a rabbit, with 
a polypeptide of the invention or peptide fragment thereof and 
recovering immune serum. 

In order that such antibodies may be made, the invention also 
provides polypeptides of the invention or fragments thereof 
haptenised to another polypeptide for use as immunogens in 
animals or humans. 

For the purposes of this invention, the term "antibody", unless 
specified to the contrary, includes fragments of whole antibodies 
which retain their binding activity for a polypeptide of the 
invention. Such fragments include Fv, F(ab') and F(ab')2 
fragments, as well as single chain antibodies. Furthermore, the 
antibodies and fragments thereof may be humanised antibodies, 
e.g. as described in EP-A-239400. 

Antibodies may be used in methods of detecting polypeptides of 
the invention present in biological samples (where such samples 
include the human or animal body samples, and environmental 
samples, mentioned above) by a method which comprises: 

(a) providing an antibody of the invention; 

(b) incubating a biological sample with said antibody 
under conditions which allow for the formation of an 
antibody-antigen complex; and 

(c) determining whether antibody-antigen complex 
comprising said antibody is formed. 

Antibodies of the invention may be bound to a solid support, for 
example an immunoassay well, microparticle, dipstick or biosensor 
and/or packaged into kits in a suitable container along with 
suitable reagents, controls, instructions and the like. 

Antibodies of the invention may be used in the detection,
diagnosis and prognosis of diseases as described above in
relation to polypeptides of the invention.

F. Compositions.

The present invention also provides compositions comprising a
polynucleotide or polypeptide of the invention together with a
carrier or diluent. Compositions of the invention also include
compositions comprising a nucleic acid, particularly an
expression vector, of the invention. Compositions further
include those carrying a recombinant virus of the invention.
Such compositions include pharmaceutical compositions, in which
case the carrier or diluent will be pharmaceutically acceptable.

Pharmaceutically acceptable carriers or diluents include those
used in formulations suitable for inhalation as well as oral, 
parenteral (e.g. intramuscular or intravenous or transcutaneous) 
administration. The formulations may conveniently be presented 
in unit dosage form and may be prepared by any of the methods 
well known in the art of pharmacy. Such methods include the step 
of bringing into association the active ingredient with the 
carrier which constitutes one or more accessory ingredients. In 
general the formulations are prepared by uniformly and intimately 
bringing into association the active ingredient with liquid 
carriers or finely divided solid carriers or both, and then, if 
necessary, shaping the product. 

For example, formulations suitable for parenteral administration
include aqueous and non-aqueous sterile injection solutions which
may contain anti-oxidants, buffers, bacteriostats and solutes
which render the formulation isotonic with the blood of the
intended recipient, and aqueous and non-aqueous sterile
suspensions which may include suspending agents and thickening
agents, and liposomes or other microparticulate systems which are
designed to target the polynucleotide or the polypeptide of the
invention to blood components or one or more organs, or to target
cells such as M cells of the intestine after oral administration.

G. Vaccines.

In another aspect, the invention provides novel vaccines for the
prevention and treatment of infections caused by Mptb, Mavs,
other GS-containing pathogenic mycobacteria and Mtb in animals
and humans. The term "vaccine" as used herein means an agent
used to stimulate the immune system of a vertebrate, particularly
a warm blooded vertebrate including humans, so as to provide
protection against future harm by an organism to which the
vaccine is directed or to assist in the eradication of an
organism in the treatment of established infection. The immune
system will be stimulated by the production of cellular immunity
and antibodies, desirably neutralizing antibodies, directed to
epitopes found on or in a pathogenic mycobacterium which
expresses any one of the ORFs of the invention. The antibody so
produced may be any of the immunological classes, such as the
immunoglobulins A, D, E, G or M. Vaccines which stimulate the
production of IgA are of interest since this is the principal
immunoglobulin produced by the secretory system of warm-blooded
animals, and the production of such antibodies will help prevent
infection or colonization of the intestinal tract. However, an
IgM and IgG response will also be desirable for systemic
infections such as Crohn's disease or tuberculosis.

Vaccines of the invention include polynucleotides of the 
invention or fragments thereof in suitable vectors and 
administered by injection of naked DNA using standard protocols. 
Polynucleotides of the invention or fragments thereof in suitable 
vectors for the expression of the polypeptides of the invention 
may be given by injection, inhalation or by mouth. Suitable 
vectors include M.bovis BCG, M.smegmatis or other mycobacteria, 
Corynebacteria, Salmonella or other agents according to 
established protocols. 

Polypeptides of the invention or fragments thereof in
substantially isolated form may be used as vaccines by injection,
inhalation, oral administration or by transcutaneous application
according to standard protocols. Adjuvants (such as Iscoms or
polylactide-coglycolide encapsulation), cytokines such as IL-12
and other immunomodulators may be used for the selective
enhancement of the cell mediated or humoral immunological
responses. Vaccination with polynucleotides and/or polypeptides
of the invention may be undertaken to increase the susceptibility
of pathogenic mycobacteria to antimicrobial agents in vivo.

In instances wherein the polypeptide is correctly configured so 
as to provide the correct epitope, but is too small to be 
immunogenic, the polypeptide may be linked to a suitable carrier. 

A number of techniques for obtaining such linkage are known in
the art, including the formation of disulfide linkages using
N-succinimidyl-3-(2-pyridylthio)propionate (SPDP) and succinimidyl
4-(N-maleimido-methyl)cyclohexane-1-carboxylate (SMCC) obtained
from Pierce Company, Rockford, Illinois (if the peptide lacks
a sulfhydryl group, this can be provided by addition of a
cysteine residue). These reagents create a disulfide linkage
between themselves and peptide cysteine residues on one protein
and an amide linkage through the epsilon-amino on a lysine, or
other free amino group in the other. A variety of such
disulfide/amide-forming agents are known. See, for example,
Immun Rev (1982) 62:185. Other bifunctional coupling agents form
a thioether rather than a disulfide linkage. Many of these
thioether-forming agents are commercially available and include
reactive esters of 6-maleimidocaproic acid, 2-bromoacetic acid,
2-iodoacetic acid, 4-(N-maleimido-methyl)cyclohexane-1-carboxylic
acid, and the like. The carboxyl groups can be activated by
combining them with succinimide or 1-hydroxyl-2-nitro-4-sulfonic
acid, sodium salt. An additional method of coupling antigens
employs the rotavirus/"binding peptide" system described in EPO
Pub. No. 259,149, the disclosure of which is incorporated herein
by reference. The foregoing list is not meant to be exhaustive,
and modifications of the named compounds can clearly be used.

Any carrier may be used which does not itself induce the
production of antibodies harmful to the host. Suitable carriers
are typically large, slowly metabolized macromolecules such as
proteins; polysaccharides, such as latex functionalized
Sepharose®, agarose, cellulose, cellulose beads and the like;
polymeric amino acids, such as polyglutamic acid, polylysine,
polylactide-coglycolide and the like; amino acid copolymers; and
inactive virus particles. Especially useful protein substrates
are serum albumins, keyhole limpet hemocyanin, immunoglobulin
molecules, thyroglobulin, ovalbumin, tetanus toxoid, and other
proteins well known to those skilled in the art.

The immunogenicity of the epitopes may also be enhanced by
preparing them in mammalian or yeast systems fused with or
assembled with particle-forming proteins such as, for example,
that associated with hepatitis B surface antigen. See, e.g.,
US-A-4,722,840. Constructs wherein the epitope is linked directly
to the particle-forming protein coding sequences produce hybrids
which are immunogenic with respect to the epitope. In addition,
all of the vectors prepared include epitopes specific to HBV,
having various degrees of immunogenicity, such as, for example,
the pre-S peptide.

In addition, portions of the particle-forming protein coding 
sequence may be replaced with codons encoding an epitope of the 
invention. In this replacement, regions which are not required 
to mediate the aggregation of the units to form immunogenic
particles in yeast or mammals can be deleted, thus eliminating 
additional HBV antigenic sites from competition with the epitope 
of the invention. 

Vaccines may be prepared from one or more immunogenic 
polypeptides of the invention. These polypeptides may be
expressed in various host cells (e.g., bacteria, yeast, insect,
or mammalian cells), or alternatively may be isolated from viral
preparations or made synthetically. 

In addition to the above, it is also possible to prepare live 
vaccines of attenuated microorganisms which express one or more
recombinant polypeptides of the invention. Suitable attenuated
microorganisms are known in the art and include, for example, 
viruses (e.g., vaccinia virus), as well as bacteria. 

The preparation of vaccines which contain an immunogenic
polypeptide(s) as active ingredients is known to one skilled in
the art. Typically, such vaccines are prepared as injectables,
or as suitably encapsulated oral preparations, and either liquid
solutions or suspensions; solid forms suitable for solution in,
or suspension in, liquid prior to ingestion or injection may also
be prepared. The preparation may also be emulsified, or the
protein encapsulated in liposomes. The active immunogenic
ingredients are often mixed with excipients which are 
pharmaceutically acceptable and compatible with the active 
ingredient. Suitable excipients are, for example, water, saline, 
dextrose, glycerol, ethanol, or the like and combinations 
thereof. In addition, if desired, the vaccine may contain minor 
amounts of auxiliary substances such as wetting or emulsifying 
agents, pH buffering agents, and/or adjuvants which enhance the 
effectiveness of the vaccine. Examples of adjuvants which may
be effective include but are not limited to: aluminum hydroxide,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred
to as nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(CGP 19835A, referred to as MTP-PE), and RIBI, which contains
three components extracted from bacteria, monophosphoryl lipid
A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in
a 2% squalene/Tween® 80 emulsion. The effectiveness of an
adjuvant may be determined by measuring the amount of antibodies
directed against an immunogenic polypeptide containing an
antigenic sequence resulting from administration of this
polypeptide in vaccines which are also comprised of the various
adjuvants.

The vaccines are conventionally administered parenterally, by 
injection, for example, either subcutaneously or intramuscularly. 
Additional formulations which are suitable for other modes of
administration include suppositories, oral formulations and
enemas. For suppositories, traditional binders and carriers may
include, for example, polyalkylene glycols or triglycerides; such 
suppositories may be formed from mixtures containing the active 
ingredient in the range of 0.5% to 10%, preferably 1% - 2%. Oral 
formulations include such normally employed excipients as, for 
example, pharmaceutical grades of mannitol, lactose, starch, 
magnesium stearate, sodium saccharine, cellulose, magnesium 
carbonate, and the like. These compositions take the form of 
solutions, suspensions, tablets, pills, capsules, sustained 
release formulations or powders and contain 10% - 95% of active 
ingredient, preferably 25% - 70%. 

The proteins may be formulated into the vaccine as neutral or
salt forms. Pharmaceutically acceptable salts include the acid
addition salts (formed with free amino groups of the peptide)
which are formed with inorganic acids such as, for example,
hydrochloric or phosphoric acids, or such organic acids as
acetic, oxalic, tartaric, maleic, and the like. Salts formed
with the free carboxyl groups may also be derived from inorganic 
bases such as, for example, sodium, potassium, ammonium, calcium, 
or ferric hydroxides, and such organic bases as isopropylamine, 
trimethylamine, 2-ethylamino ethanol, histidine, procaine, and 
the like. 

The vaccines are administered in a manner compatible with the 
dosage formulation, and in such amount as will be 
prophylactically and/or therapeutically effective. The quantity
to be administered, which is generally in the range of 5 µg to
250 µg of antigen per dose, depends on the subject to be treated,
capacity of the subject's immune system to synthesize antibodies, 
mode of administration and the degree of protection desired. 
Precise amounts of active ingredient required to be administered 
may depend on the judgement of the practitioner and may be 
peculiar to each subject. 

The vaccine may be given in a single dose schedule, or preferably 
in a multiple dose schedule. A multiple dose schedule is one in 
which a primary course of vaccination may be with 1-10 separate 
doses, followed by other doses given at subsequent time intervals 
required to maintain and/or reinforce the immune response, for
example, at 1-4 months for a second dose, and if needed, a
subsequent dose(s) after several months. The dosage regimen will
also, at least in part, be determined by the need of the
individual and be dependent upon the judgement of the
practitioner. 

In a further aspect of the invention, there is provided an 
attenuated vaccine comprising a normally pathogenic mycobacterium
which harbours an attenuating mutation in any one of the genes
encoding a polypeptide of the invention. The gene is selected
from the group of ORFs A, B, C, D, E, F, G and H, including the 
homologous ORFs B, C, E and F in Mtb. 

The mycobacteria may be used in the form of killed bacteria or 
as a live attenuated vaccine. There are advantages to a live
attenuated vaccine. The whole live organism is used, rather than
dead cells or selected cell components which may exhibit modified
or denatured antigens. Protein antigens in the outer membrane
will maintain their tertiary and quaternary structures.
Therefore the potential to elicit a good protective long term
immunity should be higher.

The term "mutation" and the like refers to a genetic lesion in 
a gene which renders the gene non-functional. This may be at
either the level of transcription or translation. The term thus
envisages deletion of the entire gene or substantial portions
thereof, and also point mutations in the coding sequence which
result in truncated gene products unable to carry out the normal 
function of the gene. 

A mutation introduced into a bacterium of the invention will 
generally be a non-reverting attenuating mutation. Non-reverting 
means that for practical purposes the probability of the mutated
gene being restored to its normal function is small, for example
less than 1 in 10^6 such as less than 1 in 10^9 or even less than
1 in 10^12.

An attenuated mycobacterium of the invention may be in isolated
form. This is usually desirable when the bacterium is to be used
for the purposes of vaccination. The term "isolated" means that
the bacterium is in a form in which it can be cultured, processed
or otherwise used in a form in which it can be readily identified
and in which it is substantially uncontaminated by other
bacterial strains, for example non-attenuated parent strains or
unrelated bacterial strains. The term "isolated bacterium" thus
encompasses cultures of a bacterial mutant of the invention, for
example in the form of colonies on a solid medium or in the form
of a liquid culture, as well as frozen or dried preparations of
the strains.

In a preferred aspect, the attenuated mycobacterium further
comprises at least one additional mutation. This may be a
mutation in a gene responsible for the production of products
essential to bacterial growth which are absent in a human or
animal host. For example, mutations to the gene for aspartate
semi-aldehyde dehydrogenase (asd) have been proposed for the
production of attenuated strains of Salmonella. The asd gene is
described further in Gene (1993) 129: 123-128. A lesion in the
asd gene, encoding the enzyme aspartate semi-aldehyde
dehydrogenase, would render the organism auxotrophic for the
essential nutrient diaminopimelic acid (DAP), which can be provided
exogenously during bulk culture of the vaccine strain. Since
this compound is an essential constituent of the cell wall for
gram-negative and some gram-positive organisms and is absent from
mammalian or other vertebrate tissues, mutants would undergo
lysis after about three rounds of division in such tissues.
Analogous mutations may be made to the attenuated mycobacteria
of the invention.

In addition or in the alternative, the attenuated mycobacteria 
may carry a recA mutation. The recA mutation knocks out 
homologous recombination, the process which is exploited for the
construction of the mutations. Once the recA mutation has been
incorporated the strain will be unable to repair the constructed
deletion mutations. Such a mutation will provide attenuated
strains in which the possibility of homologous recombination
with DNA from wild-type strains has been minimized. RecA genes
have been widely studied in the art and their sequences are 
available. Further modifications may be made for additional 
safety. 

The invention further provides a process for preparing a vaccine
composition comprising an attenuated bacterium according to the
invention, which process comprises (a) inoculating a culture vessel
containing a nutrient medium suitable for growth of said
bacterium; (b) culturing said bacterium; (c) recovering said
bacteria and (d) mixing said bacteria with a pharmaceutically
acceptable diluent or carrier.

Attenuated bacterial strains according to the invention may be 
constructed using recombinant DNA methodology which is known per 
se. In general, bacterial genes may be mutated by a process of
targeted homologous recombination in which a DNA construct
containing a mutated form of the gene is introduced into a host
bacterium which it is desired to attenuate. The construct will
recombine with the wild-type gene carried by the host and thus
the mutated gene may be incorporated into the host genome to
provide a bacterium of the present invention which may then be
isolated. 

The mutated gene may be obtained by introducing deletions into
the gene, e.g. by digesting with a restriction enzyme which cuts
the coding sequence twice to excise a portion of the gene and
then religating under conditions in which the excised portion is
not reintroduced into the cut gene. Alternatively, frame shift
mutations may be introduced by cutting with a restriction enzyme
which leaves overhanging 5' and 3' termini, filling in and/or
trimming back the overhangs, and religating. Similar mutations
may be made by site directed mutagenesis. These are only
examples of the types of techniques which will readily be at the
disposal of those of skill in the art.
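
The two mutagenesis routes just described can be illustrated at the sequence level. The following is only a minimal sketch: the function names, the toy sequences and the choice of EcoRI (recognition site GAATTC, cut after the first G to leave a four-base 5' overhang) are illustrative assumptions and are not taken from the specification.

```python
def excise_between_sites(seq: str, site: str = "GAATTC") -> str:
    """Delete the segment between the first and last occurrence of a
    restriction site, as if the gene were cut twice and religated
    without the excised fragment. One copy of the site is regenerated."""
    seq = seq.upper()
    first = seq.find(site)
    last = seq.rfind(site)
    if first < 0 or first == last:
        raise ValueError("the site must occur at least twice")
    return seq[:first] + seq[last:]


def fill_in_and_religate(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> str:
    """Cut once at a palindromic site that leaves a 5' overhang, fill in
    both overhangs and blunt-end ligate. The overhang bases end up
    duplicated, so the sequence grows by len(site) - 2 * cut_offset."""
    seq = seq.upper()
    pos = seq.find(site)
    if pos < 0 or seq.find(site, pos + 1) >= 0:
        raise ValueError("the site must occur exactly once")
    left = seq[: pos + len(site) - cut_offset]   # filled-in left end
    right = seq[pos + cut_offset:]               # filled-in right end
    return left + right


# Hypothetical toy coding sequences, for illustration only.
toy_gene = "ATGGCCGAATTCAAAGGGTAA"
shifted = fill_in_and_religate(toy_gene)
print(len(shifted) - len(toy_gene))  # inserted bases; not a multiple of 3

toy_gene_two_sites = "ATGGAATTCCCCCCCGAATTCTAA"
print(excise_between_sites(toy_gene_two_sites))
```

With a four-base overhang the fill-in adds four bases, which is not a multiple of three, so the downstream reading frame is shifted. An enzyme with a different overhang length would need different `site` and `cut_offset` values.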

Various assays are available to detect successful recombination. 
In the case of attenuations which mutate a target gene necessary
for the production of an essential metabolite or catabolite
compound, selection may be carried out by screening for bacteria
unable to grow in the absence of such a compound. Bacteria may
also be screened with antibodies or nucleic acids of the
invention to determine the absence of production of a mutated
gene product of the invention or to confirm that the genetic
lesion introduced, e.g. a deletion, has been incorporated into
the genome of the attenuated strain.

The concentration of the attenuated strain in the vaccine will 
be formulated to allow convenient unit dosage forms to be
prepared. Concentrations of from about 10^4 to 10^9 bacteria per
ml will generally be suitable, e.g. from about 10^5 to 10^8 such as
about 10^6 per ml. Live attenuated organisms may be administered
subcutaneously or intramuscularly at up to 10^8 organisms in one
or more doses, e.g. from around 10^5 to 10^8, e.g. about 10^6 or 10^7
organisms in a single dose.

The vaccines of the invention may be administered to recipients 
to treat established disease or in order to protect them against 
diseases caused by the corresponding wild type mycobacteria, such 
as inflammatory diseases such as Crohn's disease or sarcoidosis 
in humans or Johne's disease in animals. The vaccine may be
administered by any suitable route. In general, subcutaneous or 
intramuscular injection is most convenient, but oral, intranasal 
and colorectal administration may also be used. 

WO 97/23624    PCT/GB96/03221

The following Examples illustrate aspects of the invention.

Example 1:

Tests for the presence of the GS identifier sequence were
performed on 5 µl bacterial DNA extracts (25 µg/ml to 500 µg/ml)
using polymerase chain reaction based on the oligonucleotide
primers 5'-GATGCCGTGAGGAGGTAAAGCTGC-3' (Seq ID No. 40) and
5'-GATACGGCTCTTGAATCCTGCACG-3' (Seq ID No. 41) from within the
identifier DNA sequences (Seq. ID Nos 1 and 2). PCR was performed
for 40 cycles in the presence of 1.5 mM magnesium and an
annealing temperature of 58°C. The presence or absence of the
correct amplification product indicated the presence or absence
of GS identifier sequence in the corresponding bacterium. GS
identifier sequence is shown to be present in all the laboratory
and field strains of Mptb and Mavs tested. This includes Mptb
isolates 0025 (bovine, CVL Weybridge), 0021 (caprine, Moredun),
0022 (bovine, Moredun), 0139 (human, Chiodini 1984), 0209,
0208, 0211, 0210, 0212, 0207, 0204, 0206 (bovine, Whipple 1990).
All Mptb strains were IS900 positive. The Mavs strains include
0010 and 0012 (woodpigeon, Thorel), 0018 (armadillo, Portaels) and
0034, 0037, 0038, 0040 (AIDS, Hoffner). All Mavs strains were
IS902 positive. One pathogenic M. avium strain 0033 (AIDS,
Hoffner) also contained GS identifier sequence. GS identifier
sequence is absent from other mycobacteria including other
M. avium, M. malmoense, M. szulgai, M. gordonae, M. chelonei,
M. fortuitum, M. phlei, as well as E. coli, S. aureus, Nocardia sp.,
Streptococcus sp., Shigella sp. and Pseudomonas sp.
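
The logic of the primer-based test in Example 1 can be sketched in silico. This is an illustrative sketch only: the two primer sequences are those quoted above, but the function names and the toy template are hypothetical, exact matching is assumed, and no claim is made about the size of the real amplification product.

```python
# Primer sequences quoted in Example 1 (Seq ID Nos. 40 and 41).
FORWARD_PRIMER = "GATGCCGTGAGGAGGTAAAGCTGC"
REVERSE_PRIMER = "GATACGGCTCTTGAATCCTGCACG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predicted_amplicon(template: str, forward: str, reverse: str):
    """Return the region of `template` bounded by the two primers, or
    None if either primer site is missing. The reverse primer anneals
    to the opposite strand, so its reverse complement is searched for
    on the given strand, downstream of the forward primer."""
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical toy template (not a real GS sequence): both primer sites
# separated by an arbitrary 12-base spacer.
toy_template = (
    "AAAA" + FORWARD_PRIMER + "TTTTCCCCGGGG"
    + reverse_complement(REVERSE_PRIMER) + "AAAA"
)
product = predicted_amplicon(toy_template, FORWARD_PRIMER, REVERSE_PRIMER)
print(len(product) if product else "no product")
```

A template lacking either site returns `None`, which corresponds to the absence of the amplification product in GS-negative organisms. Real primer annealing tolerates some mismatches, which this exact-match sketch ignores.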

Example 2:

To obtain the full sequence of GS in Mavs and Mptb we generated
a genomic library of Mavs using the restriction endonuclease
EcoRI and cloning into the vector pUC18. This achieved a
representative library which was screened with 32P-labelled
identifier sequence yielding a positive clone containing a 17 kbp
insert. We constructed a restriction map of this insert and
identified GS as fragments unique to Mavs and Mptb and not
occurring in laboratory strains of M. avium. These fragments
were sub-cloned into pUC18 and pGEM4Z. We identified GS
contained within an 8 kb region. The full nucleotide sequence
was determined for GS on both DNA strands using primer walking
and automated DNA sequencing. DNA sequence for GS in Mptb was
obtained using overlapping PCR products generated using Pwo DNA
polymerase, a proofreading thermostable enzyme. The final DNA
sequences were derived using the University of Wisconsin GCG gel
assembly software package.

Example 3:

The DNA sequence of GS in Mavs and Mptb was found to be more
than 99% homologous. The ORFs encoded in GS were identified
using GeneRunner and DNAStar computer programmes. Eight ORFs
were identified and designated GSA, GSB, GSC, GSD, GSE, GSF, GSG
and GSH. Database comparisons were carried out against the
GenEMBL Database release version 48.0 (9/96), using the BLAST and
BLIXEM programmes. GSA and GSB encoded proteins of 13.5 kDa and
30.7 kDa respectively, both of unknown functions. GSC encoded
a protein of 38.4 kDa with a 65% homology to the amino acid
sequence of rfbD of V. cholerae, a 62% amino acid sequence
homology to gmd of E. coli and a 58% homology to gca of
Ps. aeruginosa which are all GDP-D-mannose dehydratases.
Equivalent gene products in H. influenzae, S. dysenteriae,
Y. enterocolitica, N. gonorrhoeae, K. pneumoniae and rfbD in
Salmonella enterica are all involved in 'O'-antigen processing
known to be linked to pathogenicity. GSD encoded a protein of
37.1 kDa which showed 58% homology at the DNA level to wcaG from
E. coli, a gene involved in the synthesis and regulation of
capsular polysaccharides, also related to pathogenicity. GSE
was found to have a >30% amino acid homology to rfbT of
V. cholerae, involved in the transport of specific LPS components
across the cell membrane. In V. cholerae the gene product causes
a seroconversion from the Inaba to the Ogawa 'epidemic' strain.
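
Percentage figures of the kind quoted above are derived from pairwise alignments. The sketch below shows one common way to compute a percent identity from two pre-aligned sequences. It is an assumption that this matches the measure reported by the programmes used here, which may instead report similarity; the function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal
    length, where '-' marks a gap. Columns containing a gap in either
    sequence are skipped, so the denominator is the number of aligned
    residue pairs."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 4 identical residues out of 5 aligned pairs.
print(percent_identity("MKT-LV", "MKSALV"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values for gapped alignments.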

GSF encoded a protein of 30.2 kDa which was homologous in the
range 25-40% at the amino acid level to several glucosyl
transferases such as rfpA of K. pneumoniae, rfbB of K. pneumoniae,
lgtD of H. influenzae, lsi of N. gonorrhoeae. In E. coli an
equivalent gene galE adds β-1-3 N-acetylglucosamine to galactose,
the latter only found in 'O' and 'M' antigens which are also
related to pathogenicity. GSH comprising the ORFs GSH1 and GSH2
encodes a protein totalling about 60 kDa which is a putative
transposase with a 40-43% homology at the amino acid level to
the equivalent gene product of IS22 in E. coli. This family of
insertion sequences is broadly distributed amongst gram negative
bacteria and is responsible for mobility and transposition of
genetic elements. An IS21-like element in B. fragilis is split
either side of the β-lactamase gene controlling its activation
and expression. We programmed an E. coli S30 cell-free extract
with plasmid DNA containing the ORF GSH under the control of a
lac promoter in the presence of 35S-methionine, and
demonstrated the translation of an abundant 60 kDa protein.

The proteins homologous to GS encoded in other organisms are in
general highly antigenic. Thus the proteins encoded by the ORFs
in GS may be used in immunoassays of antibody or cell mediated
immuno-reactivity for diagnosing infections caused by
mycobacteria, particularly Mptb, Mavs and Mtb. Enhancement of
host immune recognition of GS encoded proteins by vaccination
using naked specific DNA or recombinant GS proteins may be used
in the prevention and treatment of infections caused by Mptb,
Mavs and Mtb in humans and animals. Mutation or deletion of all
or some of the ORFs A to H in GS may be used to generate
attenuated strains of Mptb, Mavs or Mtb with lower pathogenicity
for use as living or killed vaccines in humans and animals. Such
vaccines are particularly relevant to Johne's disease in animals,
to diseases caused by Mptb in humans such as Crohn's disease, and
to the management of tuberculosis especially where the disease
is caused by multiple drug-resistant organisms.
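
The ORF identification step of Example 3 was carried out with the GeneRunner and DNAStar programmes. Purely as an illustration of the underlying idea, the sketch below scans the three forward reading frames of a sequence for start-to-stop open reading frames. The function name, the minimum-length threshold, the treatment of GTG as an alternative start codon and the toy sequence are all assumptions, and this is not a reimplementation of either programme.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumption for this sketch: ATG and GTG are both accepted as starts.
START_CODONS = {"ATG", "GTG"}


def find_orfs(seq: str, min_codons: int = 3):
    """Return (start, end, frame) tuples for ORFs on the given strand.

    `start` is the index of the first base of the start codon, `end` is
    one past the last base of the stop codon, and `frame` is 0, 1 or 2.
    Only the first start codon before each stop is reported."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Hypothetical toy sequence containing one short ORF in frame 2.
toy = "CCATGAAATTTTAGCC"
for start, end, frame in find_orfs(toy):
    print(frame, toy[start:end])
```

To cover all six reading frames, the same function can be applied to the reverse complement of the sequence, with the coordinates mapped back to the original strand.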

SEQUENCE LISTING

Seq. ID No. 1

5'- 1 GATCCAACTA AACCCGATGG AACCCCGCGC AAACTATTGG ACGTCTCCGC GCTACGCAGT
61 TGGGTTGGCG CCCGCGAATC GCACTGAAAG AGGGCATCGA TGCAACGGTG TCGTGGTACC
121 GCACAAATGC CGATGCCGTG AGGAGGTAAA GCTGCGGGCC GGCCGATGTT ATCCCTCCGG
181 CCGGACGGGT AGGGCGACCT GCCATCGAGT GGTACGGCAG TCGCCTGGCC GGCGAGGCGC
241 ATGGCCTATG TGAGTATCCC ATAGCCTGGC TTGGCTCGCC CCTACGCATT ATCAGTTGAC
301 CGCTTTCGCG CCACGTCGCA GGCTTGCGGC AGCATCCCGT TCAGGTCTCC TCATGGTCCG
361 GTGTGGCACG ACCACGCAAG CTCGAACCGA CTCGTTTCCC AATTTCGCAT GCTAATATCG
421 CTCGATGGAT TTTTTGCGCA ACGCCGGCTT GATGGCTCGT AACGTTAGCA CCGAGATGCT
481 GCGCCACTCC GAACGAAAGC GCCTATTAGT AAACCAAGTC GAAGCATACG GAGTCAACGT
541 TGTTATTGAT GTCGGTGCTA ACTCCGGCCA GTTCGGTAGC GCTTTGCGTC GTGCAGGATT
601 CAAGAGCCGT ATCGTTTCCT TTGAACCTCT TTCGGGGCCA TTTGCGCAAC TAACGCGCAA
661 GTCGGCATCG GATC -3'



Seq. ID No. 2

5'- 1 GATCCGATGC CGACTTGCGC GTTAGTTGCG CAAATGGCCC CGAAAGAGGT TCAAAGGAAA
61 CGATACGGCT CTTGAATCCT GCACGACGCA AAGCGCTACC GAACTGGCCG GAGTTAGCAC
121 CGACATCAAT AACAACGTTG ACTCCGTATG CTTCGACTTG GTTTACTAAT AGGCGCTTTC
181 GTTCGGAGTG GCGCAGCATC TCGGTGCTAA CGTTACGAGC CATCAAGCCG GCGTTGCGCA
241 AAAAATCCAT CGAGCGATAT TAGCATGCGA AATTGGGAAA CGAGTCGGTT CGAGCTTGCG
301 TGGTCGTGCC ACACCGGACC ATGAGGAGAC CTGAACGGGA TGCTGCCGCA AGCCTGCGAC
361 GTGGCGCGAA AGCGGTCAAC TGATAATGCG TAGGGGCGAG CCAAGCCAGG CTATGGGATA
421 CTCACATAGG CCATGCGCCT CGCCGGCCAG GCGACTGCCG TACCACTCGA TGGCAGGTCG
481 CCCTACCCGT CCGGCCGGAG GGATAACATC GGCCGGCCCG CAGCTTTACC TCCTCACGGC
541 ATCGGCATTT GTGCGGTACC ACGACACCGT TGCATCGATG CCCTCTTTCA GTGCGATTCG
601 CGGGCGCCAA CCCAACTGCG TAGCGCGGAG ACGTCCAATA GTTTGCGCGG GGTTCCATCG
661 GGTTTAGTTG GATC -3'

Seq. ID No. 3

1 GAATTCTGGG TTGGAGACGA CGTCGAACTC CTGGTCGGTC TTGCTTCGAA

51 TGATCGCTGT GATCTGGTCG GCGGTGCCGA CAGGAACCGT CGACTTGTCG 

101 ACGATCACCT TGTACCGGTC GATGTATGAC CCAATGTCGT CCGCAACCGA 

151 GAAGACGTAC GTCAGGTCCG CCGCCCCGCT TTCACCCATG GGCGTCGGGA

201 CGGCGATGAA AATGACGTCC GCGTGCTCGA TTCCGCGTTG CCGGTCGGTG 

251 GTGAAGTCAA TCAGCCCGTT CTCACGGTTC CTCGCAATCA ACTCCCAACC 

301 CGGGCTCGAA AATCGGGACA CTGCCTGCGA GGAGCAAATC GATCTTGGCC 

351 TGATCGATAT CGACACAGAC GACATCGTTG CCGCTATCCG CGAGACAGGC 

401 GCCCGTGACG AGGCCTACAT AGCCTGATCC GACCACCGAA ATTTTCAAGA

451 TGACCCCTTC AAGTCCCCGA TCGGTCGACG ACCATACTGC CGCAACTCTG 

501 TACCCTCCGT GGGTAATTCG CATGTCGCGT TCGTAAGGAG CAGCCAGCGA 

551 GTCGGGGACG TTCGGTGAGA GAGTCGCAGG ACTACGAGGT TGCCGGTGCG 

601 ATACATCACA GTGTTGCGTC TGTCGGCAAC GATGCAGCAA GAACCCACGG

651 GGCAGCCCTG AACTGCGCGC ATGACCGGTC CTTGTCCTGG CACCTTTGAT

701 CGGCCACCGC TTCCATGCGA ACATGACCGG AATCCATAGC GCGTGGTCAA 

751 GCAGCGGGGA GGTAGACGTC GGTGTCATCT GCTCCAACCG TGTCGGTGAT 

801 AACGATTTCG CTGAACGATC TCGAGGGATT GAAAAGCACC GTGGAGAGCG 

851 TTCGCGCGCA GCGCTATGGG GGGCGAATCG AGCACATCGT CATCGACGGT 

901 GGATCGGGCG ACGCCGTCGT GGAGTATCTG TCCGGCGATC CTGGCTTTGC

951 ATATTGGCAA TCTCAGCCCG ACAACGGGAG ATATGACGCG ATGAATCAGG

1001 GCATTGCCCA TTCGTCGGGC GACCTGTTGT GGTTTATGCA CTCCACGGAT 

1051 CGTTTCTCCG ATCCAGATGC AGTCGCTTCC GTGGTGGAGG CGCTCTCGGG 

1101 GCATGGACCA GTACGTGATT TGTGGGGTTA CGGGAAAAAC AACCTTGTCG 

1151 GACTCGACGG CAAACCACTT TTCCCTCGGC CGTACGGCTA TATGCCGTTT

1201 AAGATGCGGA AATTTCTGCT CGGCGCGACG GTTGCGCATC AGGCGACATT 

1251 CTTCGGCGCG TCGCTGGTAG CCAAGTTGGG CGGTTACGAT CTTGATTTTG 

1301 GACTCGAGGC GGACCAGCTG TTCATCTACC GTGCCGCACT AATACGGCCT 

1351 CCCGTCACGA TCGACCGCGT GGTTTGCGAC TTCGATGTCA CGGGACCTGG 

1401 TTCAACCCAG CCCATCCGTG AGCACTATCG GACCCTGCGG CGGCTCTGGG

1451 ACCTGCATGG CGACTACCCG CTGGGTGGGC GCAGAGTGTC GTGGGCTTAC 

1501 TTGCGTGTGA AGGAGTACTT GATTCGGGCC GACCTGGCCG CATTCAACGC 

1551 GGTAAAGTTC TTGCGAGCGA AGTTCGCCAG AGCTTCGCGG AAGCAAAATT 

1601 CATAGAAACC AACTTCTACT GCCTGACCTG AGCAGCGCCG AGGCGCGCAG

1651 CGCGATCAGT GCGACCTGAA CGGCCAGGTG GAAAGCGCCA CCGATCCCGG

1701 CACCGAGTGC CTGACGCTTC GGATCCCTTG CACCACAACG AGAGTGAGAG 

1751 CGCCATGATG AGGAAATATC GGCTGGGCGG AGTCAACGCC GGAGTGACAA 

1801 AAGTGAGAAC CCGGTGAAGC GAGCGCTTAT AACAGGGATC ACGGGGCAGG 

1851 ATGGTTCCTA CCTCGCCGAG CTACTACTGA GCAAGGGATA CGAGGTTCAC 

1901 GGGCTCGTTC GTCGAGCTTC GACGTTTAAC ACGTCGCGGA TCGATCACCT

1951 CTACGTTGAC CCACACCAAC CGGGCGCGCG CTTGTTCTTG CACTATGCAG

2001 ACCTCACTGA CGGCACCCGG TTGGTGACCC TGCTCAGCAG TATCGACCCG 

2051 GATGAGGTCT ACAACCTCGC AGCGCAGTCC CATGTGCGCG TCAGCTTTGA 

2101 CGAGCCAGTG CATACCGGAG ACACCACCGG CATGGGATCG ATCCGACTTC 

2151 TGGAAGCAGT CCGCCTTTCT CGGGTGGACT GCCGGTTCTA TCAGGCTTCC

2201 TCGTCGGAGA TGTTCGGCGC ATCTCCGCCA CCGCAGAACG AATCGACGCC 

2251 GTTCTATCCC CGTTCGCCAT ACGGCGCGGC CAAGGTCTTC TCGTACTGGA 

2301 CGACTCGCAA CTATCGAGAG GCGTACGGAT TATTCGCAGT GAATGGCATC 

2351 TTGTTCAACC ATGAGTCCCC CCGGCGCGGC GAGACTTTCG TGACCCGAAA 

2401 GATCACGCGT GCCGTGGCGC GCATCCGAGC TGGCGTCCAA TCGGAGGTCT

2451 ATATGGGCAA CCTCGATGCG ATCCGCGACT GGGGCTACGC GCCCGAATAT 

2501 GTCGAGGGGA TGTGGAGGAT GTTGCAAGCG CCTGAACCTG ATGACTACGT
2551 CCTGGCGACA GGGCGTGGTT ACACCGTACG TGAGTTCGCT CAAGCTGCTT
2601 TTGACCATGT CGGGCTCGAC TGGCAAAAGC GCGTCAAGTT TGACGACCGC 
2651 TATTTGCGTC CCACCGAGGT CGATTCGCTA GTAGGAGATG CCGACAAGGC
2701 GGCCCAGTCA CTCGGCTGGA AAGCTTCGGT TCATACTGGT GAACTCGCGC 
2751 GCATCATGGT GGACGCGGAC ATCGCCGCGT TGGAGTGCGA TGGCACACCA

2801 TGGATCGACA CGCCGATGTT GCCTGGTTGG GGCAGAGTAA GTTGACGACT 
2851 ACACCTGGGC CTCTGGACCG CGCAACGCCC GTGTATATCG CCGGTCATCG 
2901 GGGGCTGGTC GGCTCAGCGC TCGTACGTAG ATTTGAGGCC GAGGGGTTCA 
2951 CCAATCTCAT TGTGCGATCA CGCGATGAGA TTGATCTGAC GGACCGAGCC 
3001 GCAACGTTTG ATTTTGTGTC TGAGACAAGA CCACAGGTGA TCATCGATGC

3051 GGCCGCACGG GTCGGCGGCA TCATGGCGAA TAACACCTAT CCCGCGGACT 
3101 TCTTGTCCGA AAACCTCCGA ATCCAGACCA ATTTGCTCGA CGCAGCTGTC 
3151 GCCGTGCGTG TGCCGCGGCT CCTTTTCCTC GGTTCGTCAT GCATCTACCC 
3201 GAAGTACGCT CCGCAACCTA TCCACGAGAG TGCTTTATTG ACTGGCCCTT 
3251 TGGAGCCCAC CAACGACGCG TATGCGATCG CCAAGATCGC CGGTATCCTG

3301 CAAGTTCAGG CGGTTAGGCG CCAATATGGG CTGGCGTGGA TCTCTGCGAT 
3351 GCCGACTAAC CTCTACGGAC CCGGCGACAA CTTCTCCCCG TCCGGGTCGC 
3401 ATCTCTTGCC GGCGCTCATC CGTCGATATG AGGAAGCCAA AGCTGGTGGT 
3451 GCAGAAGAGG TGACGAATTG GGGGACCGGT ACTCCGCGGC GCGAACTTCT 
3501 GCATGTCGAC GATCTGGCGA GCGCATGCCT GTTCCTTTTG GAACATTTCG

3551 ATGGTCCGAA CCACGTCAAC GTGGGCACCG GCGTCGATCA CAGCATTAGC 
3601 GAGATCGCAG ACATGGTCGC TACAGCGGTG GGCTACATCG GCGAAACACG 
3651 TTGGGATCCA ACTAAACCCG ATGGAACCCC GCGCAAACTA TTGGACGTCT 
3701 CCGCGCTACG CGAGTTGGGT TGGCGCCCGC GAATCGCACT GAAAGACGGC 
3751 ATCGATGCAA CGGTGTCGTG GTACCGCACA AATGCCGATG CCGTGAGGAG

3801 GTAAAGCTGC GGGTCGGCCG ATGTTATCCC TCCGGCCGGA CGGGTGGGGC 
3851 GACCTGCCGT CGAGTGGTAC GGCAGTCGCC TGGCCGGCGA GGCGCGTGGC 
3901 CTATGGGAGT ATCCAATAGC CTGGCTTGGC TCGCCCCTAC GCATTATCAG 
3951 TTGACCGCTT TCGCGCCAGC TCGCAGGCTT GCGGCAGCAT CCCGTTCAGG 
4001 TCTCCTCATG GTCCGGTGTG GCACGACCAC GCAAGCTCGA ACCGACTCGT

4051 TTCCCAATTT CGCATGCTAA TATCGCTCGA TGGATTTTTT GCGCAACGCC 
4101 GGCTTGATGG CTCGTAACGT TAGTAGCGAG ATGCTGCGCC ACTTCGAACG 
4151 AAAGCGCCTA TTAGTAAACC AATTCAAAGC ATACGGAGTC AACGTTGTTA 
4201 TTGATGTCGG TGCTAACTCC GGCCAGTTCG GTAGCGCTTT GCGTCGTGCA 
4251 GGATTCAAGA GCCGTATCGT TTCCTTTGAA CCTCTTTCGG GGCCATTTGC

4301 GCAACTAACG CGCAAGTCGG CATCGGATCC ACTATGGGAG TGTCACCAGT 
4351 ATGCCCTAGG CGACGCCGAT GAGACGATTA CCATCAATGT GGCAGGCAAT 
4401 GCGGGGGCAA GTAGTTCCGT GCTGCCGATG CTTAAAAGTC ATCAAGATGC 
4451 CTTTCCTCCC GCGAATTATA TTGGCACCGA AGACGTTGCA ATACACCGCC 
4501 TTGATTCGGT TGCATCAGAA TTTCTGAACC CTACCGATGT TACTTTCCTG

4551 AAGATCGACG TACAGGGTTT CGAGAAGCAG GTTATCACGG GCAGTAAGTC 
4601 AACGCTTAAC GAAAGCTGCG TCGGCATGCA ACTCGAACTT TCTTTTATTC 
4651 CGTTGTACGA AGGTGACATG CTGATTCATG AAGCGCTTGA ACTTGTCTAT 
4701 TCCCTAGGTT TCAGACTGAC GGGTTTGTTG CCCGGCTTTA CGGATCCGCG 
4751 CAATGGTCGA ATGCTTCAAG CTGACGGCAT TTTCTTCCGT GGGGACGATT

4801 GACATAAATG CTCCGTCGGC ACCCTGCCGG TATCCAAACG GGCGATCTGG 
4851 TGAGCCGGCC TCCCGGGCAC CTAATCGACT ATCTAAATTG AGGCGGCCGC 
4901 GACGTGCGGC ACGAACAGGT GGCCGGCTGC TAGCGTTACA CACGTCATGA 
4951 CTGCGCCAGT GTTCTCGATA ATTATCCCTA CCTTCAATGC AGCGGTGACG 
5001 CTGCAAGCCT GCCTCGGAAG CATCGTCGGG CAGACCTACC GGGAAGTGGA

5051 AGTGGTCCTT GTCGACGGCG GTTCGACCGA TCGGACCCTC GACATCGCGA 
5101 ACAGTTTCCG CCCGGAACTC GGCTCGCGAC TGGTCGTTCA CAGCGGGCCC 
5151 GATGATGGCC CCTACGACGC CATGAACCGC GGCGTCGGCG TGGCCACAGG 
5201 CGAATGGGTA CTTTTTTTAG GCGCCGACGA CACCCTCTAC GAACCAACCA

5251 CGTTGGCCCA GGTAGCCGCT TTTCTCGGCG ACCATGCGGC AAGCCATCTT 

5301 GTCTATGGCG ATGTTGTGAT GCGTTCGACG AAAAGCCGGC ATGCCGGACC

5351 TTTCGACCTC GACCGCCTCC TATTTGAGAC GAATTTGTGC CACCAATCGA 

5401 TGTTTTACCG CCGTGAGCTT TTCGACGGCA TCGGCCCTTA CAACCTGCGC 

5451 TACCGAGTCT GGGCGGACTG GGACTTCAAT ATTCGCTGCT TCTCCAACCC 

5501 GGCGCTGATT ACCCGCTACA TGGACGTCGT GATTTCCGAA TACAACGACA

5551 TGACCGGCTT CAGCATGAGG CAGGGGACTG ATAAAGAGTT CAGAAAACGG 

5601 CTGCCAATGT ACTTCTGGGT TGCAGGGTGG GAGACTTGCA GGCGCATGCT 

5651 GGCGTTTTTG AAAGACAAGG AGAATCGCCG TCTGGCCTTG CGTACGCGGT 

5701 TGATAAGGGT TAAGGCCGTC TCCAAAGAAC GAAGCGCAGA ACCGTAGTCG 

5751 CGGATCCACA TTGGACTTCT TTAACGCGTT TGCGTCCTGA TCCACCTTTC 

5801 AAGCCCGTTC CGCGTAACGC GGCGCGCAGA GAGTGGTCGC ATATCGCATC 

5851 ACTGTTCTCG TGCCAGTGCT TGGAAAGCGT CGAGCACTCT GGTTCGCGTT 

5901 CTTGACGTTC GCGCCCGCTC CTAGAGGTAG CGTGTCACGT GACTGAAGCC 

5951 AATGAGTGCA ACTCGGCGTC GCGAAAGGTT TCAGTCGCGG TTGAGCAAGA 

6001 CACCGCAAGA CTACTGGAGT GCGTGCACAA GCGCCTCCAG CTCGCGGCTG 

6051 AAAGCGGATG CAAAGGGATT CGAAGCTTGA GCAACATGCG AAGGGGAGAA 

6101 CGGCCTATGA GGCTGGGACA GGTTTTCGAT CCGCGCGCGA ATGCACTGTC 

6151 AATGGCCAAG TAGAAGTCCC CGCTGGTGGC CAGCAGAAGT CCCCACTCCG 

6201 CTGCGGGTGG TTGGCTAATT CTTGGCGGCT CCCTTCTTGT GGTCGGCGTG 

6251 GCGCATCCGG TAGGACTCGC CGGAGGTGAC GACGATGCTG GCGTGGTGCA 

6301 GCAGCCGATC GAGGATGCTG GCGGCGGTGG TGTGCTCGGG CAGGAATCGC 

6351 CCCCATTGTT CGAAGGGCCA ATGCGAGGCG ATGGCCAGGG AGCGGCGCTC 

6401 GTAGCCGGCA GCCACGAGCC GGAACAACAG TTGAGTCCCG GTGTCGTCGA 

6451 GCGGGGCGAA GCCGATCTCG TCCAAGATGA CCAGATCCGC GCGGAGCAGG 

6501 GTGTCGATGA TCTTGCCGAC GGTGTTGTCG GCCAGGCCGC GGTAGAGGAC 

6551 CTCGATCAGG TCGGCGGCGG TGAAGTAGCG GACTTTGAAT CCGGCGTGGA 

6601 CGGCAGCGTG CCCGCAGCCG ATGAGCAGGT GACTTTTGCC CGTACCAGGT 

6651 GGGCCAATGA CCGCCAGGTT CTGTTGTGCC CGAATCCATT CCAGGCTCGA 

6701 CAGGTAGTCG AACGTGGCTG CGGTGATCGA CGATCCGGTG ACGTCGAACC 

6751 CGTCGAGGGT CTTGGTGACC GGGAAGGCTG CGGCCTTGAG ACGGTTGGCG 

6801 GTGTTGGAGG CATCGCGGGC AGCGATCTCG GCCTCAACCA ACGTCCGCAG 

6851 GATCTCCTCC GGTGTCCAGC GTTGCGTCTT GGCGACTTGC AACACCTCGG 

6901 CGGCGTTGCG GCGCACCGTG GCCAGCTTCA ACCGCCGCAG CGCCGCGTCA 

6951 AGGTCAGCAG CCAGCGGTGC CGCCGAGGAC GGTGCCACCG GCTTGGCAGC 

7001 GGTGGTCATG AGGCCGTCCC GTCGGTGGTG TTGATCTTGT AGGCCTCCAA 

7051 CGAGCGGGTC TCGACGGTGG GCAGATCGAG CACGAGTGCG TCGGCGGCGG 

7101 GGCGGGGTTG TGGGGTGCCG GCGCCGGCGG CCAGGATCGA GCGCACGTCG 

7151 GCAGCGCGGA ACCGGCGAAA CGCAACCGCC CGGCGCAGCG CGTCAATCAA 

7201 AGCCTGTTCG CCGTGGGCGG CGCCAAGGCC GAGCAGAATG TCGAGTTCGG 

7251 ATTTCAGTCG GGTGTTGCCG ATCGCAGCAG CACCGACGAG GAACTGCTGC 

7301 GCTTCGGTTC CCAATGCGCA GAATCGTTTC TCTGCTTGGG TTTTCGGGCG 

7351 AGGACCACGC GAGGGTGCGG GTCTGGGTCC GTCGTAGTGT TCATCGAGGA

7401 TGGACACCTC ACCTGGGCTG ACGAGCTCGT GCTCGGCCAC GATCACACCG 

7451 GTCGCAGGTT CCAACAGGAT CAGGGCGCCA TGATCGACCA CCACCGCCAC 

7501 GGTGGCACCG ACGAGCCGCT GAGGCACCGA GTAACGAGCT GAGCCGTAAC 

7551 GGATGCACGA GAGGCCGTCG ACCTTACGGC GCACCGACCC CGAGCCGATC 

7601 GTCGGCCGCA GCGAGGGCAG CTCCCTCAAG ACGGTGCGCT CGTCAACCAA 

7651 GCGATCGTTG GGCACGGCGC AGATCTCCGA GTGGACCGTG GCATTGACCT 

7701 CGGCGCACCA TAGTTGCGCC TGGGCGTTGA GGGCACGTAG GTCGACCTGC 

7751 TCACCGGCTA ACGCAGCTTC GGTCAGCAGC GGCACCGCAA GGTCGTCCTG 

7801 AGCGTAGCCA CAGAGGTTCT CCACGATGCC CTTCGATTGC GGATCCGCAC 
7851 CGTGGCAGAA GTCCGGAACG AAGCCATAGT GGGACGCGAA TCGCACATAA
7901 TCCGGTGTTG GAACAACAAC ATTGGCGACG ACACCACCTT TGAGGCAGCC 
7951 CATCCGGTCG GCCAGGATCT TGGCCGGAAC CCCACCGATC GCCTC



Seq. ID No. 4 

1 TTCTACTGCC TGACCTGAGC AGCGCCGAGG CGCGCAGCGC GATCACTGCG ACCTGAATGG 
61 CCAGGTGGAA AGCGCCACCG ATCCCGGCAC CGAGTGCCTG ACGATTCGGA TCCCTTGCAC 
121 CACAACGAGA GTGAGACCGC CATGATGACG AAATATCGGC TGGGCGGAGT CAACGCCGGA 
181 GTGACAAAAG TGAGAACCCG GTGAAGCGAG CGCTTATAAC AGGGATCACG GGGCAGGATG 
241 GTTCCTACCT CGCCGAGCTA CTACTGAGCA AGGGATACGA GGTTCACGGG CTCGTTCGTC 
301 GAGCTTCGAC GTTTAACACG TCGCGGATCG ATCACCTCTA CGTTGACCCA CACCAACCGG 
361 GCGCGCGCTT GTTCTTGCAC TATGCAGACC TCACTGACGG CACCCGGTTG GTGACCCTGC 
421 TCAGCAGTAT CGACCCGGAT GAGGTCTACA ACCTCGCAGC GCAGTCCCAT GTGCGCGTCA 
481 GCTTTGACGA GCCAGTGCAT ACCGGAGACA CCACCGGCAT GGGATCGATC CGACTTCTGG 
541 AAGCAGTCCG CCTTTCTCGG GTGGACTGCC GGTTCTATCA GGCTTCCTCG TCGGAGATGT 
601 TCGGCGCATC TCCGCCACCG CAGAACGAAT CGACGCCGTT CTATCCCCGT TCGCCATACG 
661 GCGCGGCCAA GGTCTTCTCG TACTGGACGA CTCGCAACTA TCGAGAGGCG TACGGATTAT 
721 TCGCAGTGAA TGGCATCTTG TTCAACCATG AGTCCCCCCG GCGCGGCGAG ACTTTCGTGA 
781 CCCGAAAGAT CACGCGTGCC GTGGCGCGCA TCCGAGCTGG CGTCCAATCG GAGGTCTATA 
841 TGGGCAACCT CGATGCGATC CGCGACTGGG GCTACGCGCC CGAATATGTC GAGGGGATGT 
901 GGAGGATGTT GCAAGCGCCT GAACCTGATG ACTACGTCCT GGCGACAGGG CGTGGTTACA 
961 CCGTACGTGA GTTCGCTCAA GCTGCTTTTG ACCACGTCGG GCTCGACTGG CAAAAGCACG 
1021 TCAAGTTTGA CGACCGCTAT TTGCGCCCCA CCGAGGTCGA TTCGCTAGTA GGAGATGCCG 
1081 ACAGGGCGGC CCAGTCACTC GGCTGGAAAG CTTCGGTTCA TACTGGTGAA CTCGCGCGCA 
1141 TCATGGTGGA CGCGGACATC GCCGCGTCGG AGTGCGATGG CACACCATGG ATCGACACGC 
1201 CGATGTTGCC TGGTTGGGGC GGAGTAAGTT GACGACTACA CCTGGGCCTC TGGACCGCGC 
1261 AACGCCCGTG TATATCGCCG GTCATCGGGG GCTGGTCGGC TCAGCGCTCG TACGTAGATT 
1321 TGAGGCCGAG GGGTTCACCA ATCTCATTGT GCGATCACGC GATGAGATTG ATCTGACGGA 
1381 CCGAGCCGCA ACGTTTGATT TTGTGTCTGA GACAAGACCA CAGGTGATCA TCGATGCGGC 
1441 CGCACGGGTC GGCGGCATCA TGGCGAATAA CACCTATCCC GCGGACTTCT TGTCCGAAAA 
1501 CCTCCGAATC CAGACCAATT TGCTCGACGC AGCTGTCGCC GTGCGTGTGC CGCGGCTCCT 
1561 TTTCCTCGGT TCGTCATGCA TCTACCCGAA GTACGCTCCG CAACCTATCC ACGAGAGTGC 
1621 TTTATTGACT GGCCCTTTGG AGCCCACCAA CGACGCGTAT GCGATCGCCA AGATCGCCGG 
1681 TATCCTGCAA GTTCAGGCGG TTAGGCGCCA ATATGGGCTG GCGTGGATCT CTGCGATGCC 
1741 GACTAACCTC TACGGACCCG GCGACAACTT CTCCCCGTCC GGGTCGCATC TCTTGCCGGC 
1801 GCTCATCCGT CGATATGAGG AAGCCAAAGC TGGTGGTGCA GAAGAGGTGA CGAATTGGGG 
1861 GACCGGTACT CCGCGGCGCG AACTTCTGCA TGTCGACGAT CTGGCGAGCG CATGCCTGTT 
1921 CCTTTTGGAA CATTTCGATG GTCCGAACCA CGTCAACGTG GGCACCGGCG TCGATCACAG 
1981 CATTAGCGAG ATCGCAGACA TGGTCGCTAC GGCGGTGGGC TACATCGGCG AAACACGTTG 
2041 GGATCCAACT AAACCCGATG GAACCCCGCG CAAACTATTG GACGTCTCCG CGCTACGCGA 
2101 GTTGGGTTGG CGCCCGCGAA TCGCACTGAA AGACGGCATC GATGCAACGG TGTCGTGGTA 
2161 CCGCACAAAT GCCGATGCCG TGAGGAGGTA AAGCTGCGGG CCGGCCGATG TTATCCCXCC 
2221 GGCCGGACGG GTAGGGCGAC CTGCCATCGA GTGGTACGGC AGTCGCCTGG CCGGCGAGGC 
2281 GCATGGCCTA TGGGAGTATC CCATAGCCTG GCTTGGCTCG CCCCTACGCA TTATCAGTTG 
2341 ACCGCTTTCG CGCCAGCTCG CAGGCTCGCG GCAGCATCCC GTTCAGGTCT CCTCATGGTC 
2401 CGGTGTGGCA CGACCACGCA AGCTCGAACC GACTCGTTTC CCAATTTCGC ATGCTAATAT 
2461 CGCTCGATGG ATTTTTTGCG CAACGCCGGC TTGATGGCTC GTAACGTTAG CACCGAGATG 
2521 CTGCGCCACT TCGAACGAAA GCGCCTATTA GTAAACCAAT TCAAAGCATA CGGAGTCAAC 
2581 GTTGTTATTG ATGTCGGTGC TAACTCCGGC CAGTTCGGTA GCGCTTTGCG TCGTGCAGGA
2641 TTCAAGAGCC GTATCGTTTC CTTTGAACCT CTTTCGGGGC CATTTGCGCA ACTAACGCGC 
2701 GAGTCGGCAT CGGATCCACT ATGGGAGTGT CACCAGTATG CCCTAGGCGA CGCCGATGAG 
2761 ACGATTACCA TCAATGTGGC AGGCAATGCG GGGGCAAGTA GTTCCGTGCT GCCGATGCTT
2821 AAAAGTCATC AAGATGCCTT TCCTCCCGCG AATTATATTG GCACCGAAGA CGTTGCAATA 
2881 CACCGCCTTG ATTCGGTTGC ATCAGAATTT CTGAACCCTA CCGATGTTAC TTTCCTGAAG 
2941 ATCGACGTAC AGGGTTTCGA GAAGCAGGTT ATCGCGGGCA GTAAGTCAAC GCTTAACGAA 
3001 AGCTGCGTCG GCATGCAACT CGAACTTTCT TTTATTCCGT TGTACGAAGG TGACATGCTG

3061 ATTCATGAAG CGCTTGAACT TGTCTATTCC CTAGGTTTCA GACTGACGGG TTTGTTGCCC 
3121 GGATTTACGG ATCCGCGCAA TGGTCGAATG CTTCAAGCTG ACGGCATTTT CTTCCGTGGG 
3181 GACGATTGAC ATAAATGCTT GCGTCGGCAC CCTGCCGGTA TCCAAACGGG CGATCTGGTG 
3241 AGCCGGCCTC CCGGGCACCT AATCGACTAT CTAAATTGAG GCGGCCGCGA CGTGCGGCAC 

3301 GAACAGGTGG CCGGCXGCTA GCGTTACACA CGTCATGACT GCGCCAGTGT TCTCGATAAT

3361 TATCCCTACC TTCAATGCAG CGGTGACGCT GCAAGCCTGC CTCGGAAGCA TCGTCGGGCA 
3421 GACCTACCGG GAAGTGGAAG TGGTCCTTGT CGACGGCGGT TCGACCGATC GGACCCTCGA 
3481 CATCGCGAAC AGTTTCCGCC CGGAACTCGG CTCGCGACTG GTCGTTCACA GCGGGCCCGA 
3541 TGATGGCCCC TACGACGCCA TGAACCGCGG CGTCGGCGTA GCCACAGGCG AATGGGTACT 

3601 TTTTTTAGGC GCCGACGACA CCCTCTACGA ACCAACCACG TTGGCCCAGG TAGCCGCTTT

3661 TCTCGGCGAC CATGCGGCAA GCCATCTTGT CTATGGCGAT GTTGTGATGC GTTCGACGAA 
3721 AAGCCGGCAT GCCGGACCTT TCGACCTCGA CCGCCTCCTA TTTGAGACGA ATTTGTGCCA 
3781 CCAATCGATC TTTTACCGCC GTGAGCTTTT CGACGGCATC GGCCCTTACA ACCTGCGCTA 
3841 CCGAGTCTGG GCGGACTGGG ACTTCAATAT TCGCTGCTTC TCCAACCCGG CGCTGATTAC 

3901 CCGCTACATG GACGTCGTGA TTTCCGAATA CAACGACATG ACCGGCTTCA GCATGAGGCA

3961 GGGGACTGAT AAAGAGTTCA GAAAACGGCT GCCAATGTAC TTCTGGGTTG CAGGGTGGGA 
4021 GACTTGCAGG CGCATGCTGG CGTTTTTGAA AGACAAGGAG AATCGCCGTC TGGCCTTGCG
4081 TACGCGGTTG ATAAGGGTTA AGGCCGTCTC CAAAGAACGA AGCGCAGAAC CGTAGTCGCG 
4141 GATCCACATT GGACTTCTTT AACGCGTTTG CGTCCTGATC CACCTTTCAA CCCCGTTCCG 

4201 CGTGACGCGG CGCGCAGAGA GTGGTCGCAT ATCGCGTCAC TGTTCTCGTG CCAGTGCTTG

4261 GAAAGCGTCG AGCACTCTGG TTCGCGTTCT TGACGTTCGC GCCCGCCCCT AGAGGTAGCG 
4321 TGTCACGTGA CTGAAGCCAA TGAGTGCAAC TCGGCGTCGC GAAAGGTTTC AGTCGCGGTT 
4381 GAGCAAGACA CCGCAAGACT ACTGGAGTGC GTGCACAAGC GCCTCCAGCT CACGG 



Seq. ID No. 5 

1 atgaccgctg tgatctggtc ggcggtgccg acaggaaccg tcgacttgtc gacgatcacc

61 ttgtaccggt cgatgcatga cccaatgtcg tccgcaaccg agaagacgta cgtcaggtcc 
121 gccgccccgc tttcacccat gggcgtcggg acggcgatga aaatgacgtc cgcgtgctcg 
181 attccgcgtt gccggtcggt ggtgaagtca atcagcccgt tctcacggtt cctcgcaatc 
241 aactcccaac ccgggctcga aaatcgggac actgcctgcg aggagcaaat cgatcttggc 

301 ctgatcgata tcgacacaga cgacatcgtt gccgctatcc gcgagacagg cgcccgtgac

361 gaggcctaca tagcctga 



Seq. ID No. 6 

1 MIAVIWSAVPTGTVDLSTITLYRSMYDPMS
31 SATEKTYVRSAAPLSPMGVGTAMKMTSACS
61 IPRCRSVVKSISPFSRFLAINSQPGLENRD
91 TACEEQIDLGLIDIDTDDIVAAIRETGARD
121 EAYIA
Seq. ID No. 7

1 gtgtcatctg ctccaaccgt gtcggcgata acgacctcgc tgaacgatcr cgagggactg 
61 aaaagcaccg tggagagcgt tcgcgcgcag cgctatgggg ggcgaatcga gcacatcgtc 
121 atcgacggtg gaccgggcga cgccgtcgtg gagtatctgt ccggcgatcc tggctttgca 
181 tattggcaat ctcagcccga caacgggaga tatgacgcga tgaatcaggg cattgcccat 
241 tcgtcgggcg acctgttgtg gtttacgcac cccacggacc gtttctccga tccagatgca 
301 gtcgcttccg tggtggaggc gctctcgggg caeggaccag tacgtgactt gtggggttac 
361 gggaaaaaca accttgtcgg actcgacggc aaaccacttt tccctcggcc gtacggctat 
421 atgccgttta agatgcggaa atctctgctc ggcgcgacgg ttgcgcatca ggcgacactc 
481 ttcggcgcgt cgctggtagc caagttgggc ggttacgatc ttgattttgg actcgaggcg
541 gaccagctgt tcatctaccg tgccgcacta atacggcctc ccgtcacgat cgaccgcgtg
601 gtttgcgact tcgatgtcac gggacctggt tcaacccagc ccatccgtga gcactatcgg
661 accctgcggc ggctctggga cctgcatggc gactacccgc cgggtgggcg cagagtgtcg 
721 tgggcttact tgcgtgtgaa ggagtacttg attcgggccg acctggccgc attcaacgcg 
781 gtaaagttct tgcgagcgaa gttcgccaga gcttcgcgga agcaaaattc atag 

Seq. ID No. 8 
Seq. ID No. 9

1 gtgaagcgag cgcttataac agggatcacg gggcaggatg gttcctacct cgccgagcta 
61 ctactgagca agggatacga ggttcacggg cccgttcgcc gagcttcgac gtttaacacg 
121 tcgcggatcg atcacctcta cgttgaccca caccaaccgg gcgcgcgctt gttcttgcac 
181 tatgcagacc tcactgacgg cacccggttg gtgaccctgc tcagcagtat cgacccggat 
241 gaggtctaca acctcgcagc gcagtcccat gtgcgcgtca gctttgacga gccagtgcat 
301 accggagaca ccaccggcat gggatcgatc cgactcccgg aagcagtccg cctttctcgg 
361 gtggactgcc ggttctatca ggcttcctcg ccggagatgt tcggcgcatc tccgccaccg 
421 cagaacgaat cgacgccgtt ctatccccgt tcgccatacg gcgcggccaa ggtctrctcg 
481 tactggacga ctcgcaacta tcgagaggcg tacggattat tcgcagtgaa tggcatcttg 
541 ttcaaccatg agtccccccg gcgcggcgag actttcgtga cccgaaagat cacgcgtgcc 
601 gtggcgcgca tccgagctgg cgtccaatcg gaggtctata tgggcaacct cgatgcgatc 
661 cgcgactggg gctacgcgcc cgaatatgtc gaggggatgt ggaggatgtt gcaagcgcct 
721 gaacctgatg actacgtcct ggcgacaggg cgtggttaca ccgtacgtga gtccgctcaa 
781 gctgcttttg accatgtcgg gctcgactgg caaaagcgcg tcaagtttga cgaccgctat 
841 ttgcgtccca ccgaggtcga ttcgccagta ggagatgccg acaaggcggc ccagtcactc 
901 ggctggaaag cttcggttca tactggcgaa ctcgcgcgca tcatggtgga cgcggacatc 
961 gccgcgttgg agtgcgatgg cacaccatgg atcgacacgc cgatgttgcc tggtcggggc 
1021 agagcaagtc ga 
Seq. ID No. 11

1 gtgaagcgag cgcttataac agggatcacg gggcaggatg gtccctacct cgccgagcta

61 ctactgagca agggatacga ggttcacggg ctcgttcgtc gagcttcgac gtctaacacg 
121 tcgcggatcg atcacctcta cgttgaccca caccaaccgg gcgcgcgctt gttcttgcac 
181 tatgcagacc tcactgacgg cacccggttg gtgaccctgc tcagcagcat cgacccggat 
241 gaggtctaca acctcgcagc gcagtcccat gtgcgcgtca gctttgacga gccagtgcat 

301 accggagaca ccaccggcat gggatcgatc cgacttctgg aagcagcccg cctttctcgg

361 gtggactgcc ggttctatca ggcttcctcg tcggagacgc tcggcgcatc tccgccaccg 
421 cagaacgaac cgacgccgtt ctacccccgt ccgccatacg gcgcggccaa ggtcctctcg 
481 tactggacga ctcgcaacta tcgagaggcg tacggattat tcgcagtgaa cggcatcttg 
541 ttcaaccacg agtccccccg gcgcggcgag actttcgtga cccgaaagat cacgcgtgcc 

601 gtggcgcgca tccgagccgg cgtccaatcg gaggtctata cgggcaacct cgatgcgatc

661 cgcgactggg gccacgcgcc cgaacatgtc gaggggatgt ggaggatgtt gcaagcgcct 
721 gaacctgatg actacgccct ggcgacaggg cgtggtcaca ccgtacgtga gtccgctcaa 
781 gctgcccttg accacgtcgg gctcgactgg caaaagcacg ccaagtttga cgaccgctat 
841 ttgcgcccca ccgaggccga ttcgctagta ggagatgccg acagggcggc ccagtcactc 

901 ggctggaaag cttcggctca tactggtgaa ctcgcgcgca tcatggtgga cgcggacatc

961 gccgcgtcgg agtgcgacgg cacaccatgg atcgacacgc cgatgttgcc tggttggggc 
1021 ggagtaagct ga 



Seq. ID No. 12 
Seq. ID No. 13

1 gtgcgatggc acaccatgga ccgacacgcc gatgttgcct ggttggggca gagtaagttg 
61 acgactacac ctgggcctct ggaccgcgca acgcccgtgt atatcgccgg tcatcggggg
121 ctggccggcc cagcgctcgt acgtagattt gaggccgagg ggttcaccaa tcccatcgcg 
181 cgatcacgcg atgagattga tctgacggac cgagccgcaa cgtttgattt tgtgtctgag 
241 acaagaccac aggtgatcat cgatgcggcc gcacgggtcg gcggcatcat ggcgaataac 
301 acctaccccg cggacttctt gtccgaaaac ctccgaatcc agaccaattt gctcgacgca 
361 gctgtcgccg tgcgcgtgcc gcggctcctc ctcctcggcc cgtcatgcat ctacccgaag
421 tacgctccgc aacctatcca cgagagcgct ttattgactg gccctttgga gcccaccaac
481 gacgcgcatg cgatcgccaa gaccgccggt atcctgcaag ttcaggcggt taggcgccaa 
541 tatgggctgg cgtggatctc tgcgatgccg accaacccct acggacccgg cgacaacctc 
601 tccccgcccg ggtcgcatct cttgccggcg ctcatccgcc gatatgagga agccaaagct
661 ggtggtgcag aagaggtgac gaattggggg accggtactc cgcggcgcga acttctgcat 
721 gtcgacgatc tggcgagcgc atgcctgttc cttttggaac atttcgatgg tccgaaccac 
781 gtcaacgtgg gcaccggcgt cgatcacagc atnagcgaga tcgcagacat ggccgctaca 
841 gcggtgggct acatcggcga aacacgttgg gatccaacta aacccgatgg aaccccgcgc 
901 aaactattgg acgtccccgc gctacgcgag ttgggttggc gcccgcgaat cgcaccgaaa 
961 gacggcaccg atgcaacggt gtcgcggcac cgcacaaacg ccgatgccgt gaggaggcaa 
Seq. ID No. 15

1 gtgcgatggc acaccacgga ncgacacgcc gatgttgcct ggntggggcg gagcaagttg 
61 acgaccacac ctgggcccct ggaccgcgca acgcccgtgt atatcgccgg tcatcggggg
121 ctggtcggct cagcgctcgt acgcagactc gaggccgagg ggttcaccaa tctcattgtg 
181 cgatcacgcg atgagattga tctgacggac cgagccgcaa cgtttgattt tgtgtccgag 
241 acaagaccac aggtgatcat cgatgcggcc gcacgggtcg gcggcatcac ggcgaataac 
301 acctatcccg cggacttctt gcccgaaaac ctccgaatcc agaccaattt gctcgacgca 
361 gctgtcgccg tgcgtgtgcc gcggctcctt ttcctcggtt cgtcatgcat ctacccgaag
421 tacgctccgc aacctancca cgagagtgct ttattgactg gccctttgga gcccaccaac 
481 gacgcgtatg cgatcgccaa gatcgccggt atcctgcaag ttcaggcggt taggcgccaa
541 tatgggctgg cgcggatctc tgcgatgccg actaacctct acggacccgg cgacaacttc 
601 tccccgtccg ggtcgcatct cttgccggcg ctcacccgtc gatatgagga agccaaagct 
661 ggtggtgcag aagaggtgac gaattggggg accggtactc cgcggcgcga actcctgcat 
721 gtcgacgacc tggcgagcgc atgcctgttc cttttggaac atttcgatgg tccgaaccac 
781 gtcaacgtgg gcaccggcgt cgatcacagc attagcgaga tcgcagacat ggtcgctacg 
841 gcggtgggct acatcggcga aacacgttgg gatccaacta aacccgatgg aaccccgcgc 
901 aaactattgg acgtctccgc gctacgcgag ttgggttggc gcccgcgaat cgcactgaaa 
961 gacggcatcg atgcaacggt gtcgtggcac cgcacaaatg ccgatgccgt gaggaggtaa
WLGRSKLTTTPGPLDRA
SALVRRFEAEGFTNLIV
TFDFVSETRPQVIIDAA
ADFLSENLRIQTNLLDA
SSCIYPKYAPQPIHESA
AIAKIAGILQVQAVRRQ
YGPGDNFSPSGSHLLPA
EEVTNWGTGTPRRELLH
HFDGPNHVNVGTGVDHS
YIGETRWDPTKPDGTPR
RPRIALKDGIDATVSWY



Seq. ID No. 17 

1 atggattttt tgcgcaacgc cggcttgacg gctcgtaacg ttagtaccga gatgctgcgc 
61 cacttcgaac gaaagcgcct attagtaaac caattcaaag catacggagt caacgttgtt 
121 attgatgtcg gtgctaactc cggccagttc ggtagcgcct tgcgtcgtgc aggattcaag 
181 agccgtatcg tttcctttga acctctttcg gggccatttg cgcaactaac gcgcaagtcg 
241 gcatcggatc cactatggga gtgtcaccag tatgccctag gcgacgccga cgagacgatt 
301 accatcaatg tggcaggcaa tgcgggggca agtagttccg tgctgccgat gcttaaaagt 
361 catcaagatg cctttcctcc cgcgaactat attggcaccg aagacgttgc aatacaccgc 
421 cttgattcgg ttgcatcaga atctctgaac cctaccgatg ttactttcct gaagatcgac 
481 gtacagggtt tcgagaagca ggttatcacg ggcagcaagt caacgcttaa cgaaagctgc 
541 gtcggcatgc aactcgaact ttcttttatt ccgttgtacg aaggtgacat gctgattcat 
601 gaagcgcttg aacttgtcta ttccctaggt ctcagaccga cgggtttgtt gcccggcttt 
661 acggatccgc gcaacggtcg aacgcttcaa gctgacggca ttttcttccg tggggacgat 
721 tga 
Seq. ID No. 18
Seq. ID No. 19

1 atggattttt tgcgcaacgc cggcttgacg gctcgtaacg ttagcaccga gatgctgcgc 
61 cacttcgaac gaaagcgcct attagcaaac caattcaaag catacggagt caacgttgtt 
121 attgatgccg gtgctaactc cggccagttc ggtagcgctt tgcgtcgtgc aggattcaag 
181 agccgtatcg cttcctttga acctctttcg gggccacttg cgcaactaac gcgcgagtcg 
241 gcaccggatc cactatggga gtgtcaccag tatgccctag gcgacgccga tgagacgatt 
301 accatcaatg tggcaggcaa tgcgggggca agtagttccg tgctgccgat gcttaaaagt 
361 catcaagacg cctttcctcc cgcgaattat actggcaccg aagacgttgc aatacaccgc 
421 cttgattcgg ttgcatcaga atttctgaac cctaccgatg ctactttcct gaagatcgac 
481 gtacagggct tcgagaagca ggttatcgcg ggcagtaagt caacgcttaa cgaaagctgc 
541 gtcggcatgc aacccgaact ttctcttatt ccgttgtacg aaggtgacat gctgattcat 
601 gaagcgcttg aactcgtcta ttccccaggc ttcagactga cgggtttgtt gcccggacnt 
661 acggatccgc gcaatggtcg aatgcttcaa gctgacggca ttttcttccg tggggacgac 
721 tga 



Seq. ID No. 20 

1 MDFLRNAGLMARNVSTEMLRHFERKRLLVN
31 QFKAYGVNVVIDVGANSGQFGSALRRAGFK
61 SRIVSFEPLSGPFAQLTRSSASDPLWECHQ
91 YALGDADETITINVAGNAGASSSVLPMLKS
121 HQDAFPPANYIGTEDVAIHRLDSVASEFLN
151 PTDVTFLKIDVQGFEKQVIAGSKSTLNESC
181 VGMQLELSFIPLYEGDMLIHEALELVYSLG
211 FRLTGLLPGFTDPRNGRMLQADGIFFRGDD
Seq. ID No. 21

1 atgactgcgc cagtgttctc gataattatc cctaccttca atgcagcggt gacgctgcaa 
61 gcctgcctcg gaagcatcgt cgggcagacc taccgggaag tggaagtggt ccttgtcgac
121 ggcggttcga ccgatcggac cctcgacatc gcgaacagtt tccgcccgga actcggctcg 
181 cgactggtcg ttcacagcgg gcccgatgat ggcccctacg acgccatgaa ccgcggcgtc 
241 ggcgtggcca caggcgaatg ggtacttttt ttaggcgccg acgacaccct ctacgaacca 
301 accacgttgg cccaggtagc cgcttttctc ggcgaccatg cggcaagcca tcttgtctat 
361 ggcgacgttg tgatgcgctc gacgaaaagc cggcatgccg gacctttcga cctcgaccgc 
421 ctcctatttg agacgaattt gtgccaccaa tcgatctttt accgccgtga gcttttcgac 
481 ggcatcggcc cttacaacct gcgctaccga gtctgggcgg actgggactt caatattcgc 
541 tgcttctcca acccggcgct gattacccgc tacatggacg tcgtgatttc cgaatacaac 
601 gacatgaccg gcttcagcat gaggcagggg actgataaag agttcagaaa acggctgcca 
661 atgtacttct gggttgcagg gtgggagact tgcaggcgca tgctggcgtt tttgaaagac 
721 aaggagaatc gccgtctggc cttgcgtacg cggttgataa gggttaaggc cgtctccaaa 
781 gaacgaagcg cagaaccgta g 



Seq. ID No. 22 

1 MTAPVFSIIIPTFNAAVTLQACLGSIVGQT
31 YREVEVVLVDGGSTDRTLDIANSFRPELGS
61 RLVVHSGPDDGPYDAMNRGVGVATGEWVLF
91 LGADDTLYEPTTLAQVAAFLGDHAASHLVY
121 GDVVMRSTKSRHAGPFDLDRLLFETNLCHQ
151 SIFYRRELFDGIGPYNLRYRVWADWDFNIR
181 CFSNPALITRYMDVVISEYNDMTGFSMRQG
211 TDKEFRKRLPMYFWVAGWETCRRMLAFLKD
241 KENRRLALRTRLIRVKAVSKERSAEP



Seq. ID No. 23 

1 atgactgcgc cagtgttctc gataattatc cctaccttca atgcagcggt gacgctgcaa 
61 gcctgcctcg gaagcatcgt cgggcagacc taccgggaag tggaagtggt ccttgtcgac 
121 ggcggttcga ccgatcggac cctcgacatc gcgaacagtt tccgcccgga actcggctcg 
181 cgactggtcg ttcacagcgg gcccgatgat ggcccctacg acgccatgaa ccgcggcgtc 
241 ggcgtagcca caggcgaatg ggtacttttt ttaggcgccg acgacaccct ctacgaacca 
301 accacgttgg cccaggtagc cgcttttctc ggcgaccatg cggcaagcca tcttgtctat 
361 ggcgatgttg tgatgcgttc gacgaaaagc cggcatgccg gacctttcga cctcgaccgc 
421 ctcctatttg agacgaattt gtgccaccaa tcgatctttt accgccgtga gcttttcgac 
481 ggcatcggcc cttacaacct gcgctaccga gtctgggcgg actgggactt caatattcgc 
541 tgcttctcca acccggcgct gattacccgc tacatggacg tcgtgatttc cgaatacaac 
601 gacatgaccg gcttcagcat gaggcagggg actgataaag agttcagaaa acggctgcca 
661 atgtacttct gggttgcagg gtgggagact tgcaggcgca tgctggcgtt tttgaaagac 
721 aaggagaatc gccgtctggc cttgcgtacg cggttgataa gggttaaggc cgtctccaaa 
781 gaacgaagcg cagaaccgta g 



WO 97/23624 



PCT/GB96/03221 




Seq. ID No. 24 
Seq. ID No. 25

1 gtggccagca gaagtcccca ctccgctgcg ggtggttggc taattcttgg cggctccctt
61 cttgcggccg gcgtggcgca tccggtagga ctcgccggag gtgacgacga tgctggcgtg
121 gtgcagcagc cgatcgagga tgctggcggc ggtggtgtgc tcgggcagga atcgccccca 
181 ctgttcgaag ggccaatgcg aggcgatggc cagggagcgg cgctcgtagc cggcagccac 
241 gagccggaac aacagttgag tcccggcgtc gtcgagcggg gcgaagccga ccccgtccaa 
301 gatgaccaga tccgcgcgga gcagggtgtc gatgatctrg ccgacggtgt tgtcggccag 
361 gccgcggcag aggacctcga tcaggtcggc ggcggcgaag tagcggactt tgaatccggc 
421 gtggacggca gcgtgcccgc agccgacgag caggtgactt ttgcccgtac caggtgggcc 
481 aatgaccgcc aggttccgtt gtgcccgaat ccattccagg ctcgacaggt agtcgaacgt 
541 ggccgcggtg atcgacgatc cggtgacgtc gaacccgtcg agggtcttgg tgaccgggaa 
601 ggctgcggcc ttgagacggc tggcggcgtt ggaggcatcg cgggcagcga tctcggcctc 
661 aaccaacgcc cgcaggatct cctccggtgt ccagcgttgc gtcttggcga cttgcaacac 
721 ctcggcggcg ttgcggcgca ccgtggccag cttcaaccgc cgcagcgccg cgtcaaggtc 
781 agcagccagc ggtgccgccg aggacggtgc caccggcttg gcagcggtgg ccatgaggcc 
841 gtcccgtcgg tggtgttgac cttgtag 



Seq. ID No. 26 

1 VASRSPHSAAGGWLILGGSLLVVGVAHPVG
31 LAGGDDDAGVVQQPIEDAGGGGVLGQESPP
61 LFSGPMRGDGQGAALVAGSHEPEQQLSPGV
91 VERGEADLVQDDQIRAEQGVDDLADGVVGQ
121 AAVEDLDQVGGGEVADFSSGVDGSVPAADE
151 QVTFARTRWANDRQVLLCPNPFQARQVVER
181 GCGDRRSGDVEPVSGLGDREGCGLETVGGV
211 GGIAGSDLGLNQRPQDLLRCPALRLGDLQH
241 LGGVAAHRGQLQPPQRRVKVSSQRCRRGRC
271 HRLGSGGHSAVPSVVLIL






Seq. ID No. 27

1 atgggccgcc ccaaaggcgg tgtcgtcgcc aatgttgtng ttccaacacc ggattatgtg 
61 cgattcgcgt cccactacgg cttcgttccg gacttctgcc acggtgcgga tccgcaaccg 
121 aagggcatcg tggagaacct ccgtggccac gctcaggacg accctgcggt gccgctgctg
181 accgaagccg cgtcagccgg tgagcaggtc gacctacgtg ccctcaacgc ccaggcgcaa 
241 ctatggcgcg ccgaggtcaa tgccacggtc cactcggaga cctgcgccgt gcccaacgat 
301 cgcttggttg acgagcgcac cgtcttgagg gagctgccct cgctgcggcc gacgaccggc 
361 tcggggtcgg cgcgccgtaa ggtcgacggc ctctcgtgca tccgttacgg ctcagctcgt 
421 tactcggtgc ctcagcggct cgtcggtgcc accgtggcgg tggtggccga tcatggcgcc 
481 ctgatcctgt tggaacctgc gaccggtgtg atcgtggccg agcacgagct cgtcagccca 
541 ggtgaggtgt ccatcctcga tgaacactac gacggaccca gacccgcacc ctcgcgtggt 
601 cctcgcccga aaacccaagc agagaaacga ttctgcgcat tgggaaccga agcgcagcag 
661 ttcctcgtcg gtgctgctgc gatcggcaac acccgactga aatccgaact cgacattctg 
721 ctcggccttg gcgccgccca cggcgaacag gctttgattg acgcgctgcg ccgggcggtt 
781 gcgtttcgcc ggttccgcgc tgccgacgtg cgctcgatcc tggccgccgg cgccggcacc 
841 ccacaacccc gccccgccgg cgacgcactc gtgctcgatc tgcccaccgt cgagacccgc 
901 tcgttggagg cctacaagat caacaccacc gacgggacgg cctcatgacc accgctgcca 
961 agccggtggc accgtcctcg gcggcaccgc cggctgccga ccttgacgcg gcgctgcggc 
1021 ggttgaagct ggccacggtg cgccgcaacg ccgccgaggt gttgcaagtc gccaagacgc 
1081 aacgctggac accggaggag atcctgcgga cgttggttga ggccgagatc gctgcccgcg 
1141 atgcctccaa caccgccaac cgtctcaagg ccgcagcctt cccggtcacc aagaccctcg 
1201 acgggttcga cgccaccgga tcgtcgatca ccgcagccac gttcgactac ctgtcgagcc 
1261 tggaatggat tcgggcacaa cagaacctgg cggccattgg cccacctggt acgggcaaaa 
1321 gtcacctgct catcggctgc gggcacgctg ccgtccacgc cggattcaaa gtccgccact 
1381 tcaccgccgc cgacctgatc gaggtcccct accgcggcct ggccgacaac accgtcggca 
1441 agatcatcga caccctgctc cgcgcggatc cggtcatctt ggacgagatc ggcttcgccc 
1501 cgctcgacga caccgggact caactgttgt tccggctcgt ggctgccggc tacgagcgcc 
1561 gctccctggc catcgccccg cattggcccc tcgaacaatg ggggcgattc ctgcccgagc 
1621 acaccaccgc cgccagcatc cccgatcggc tgctgcacca cgccagcatc gtcgtcacct 
1681 ccggcgagtc ctaccggatg cgccacgccg accacaagaa gggagccgcc aagaattag 



Seq. ID No. 28
PTPDYVRFASHYGFVP
ENLCGYAQDDLAVPLL
LNAQAQLWCAEVNATV
ERTVLRELPSLRPTIG
RYGSARYSVPQRLVGA
HPATGVIVAEHSLVSP
PAPSRGPRPKTQAEKR 
AAAIGHRLKSELDIL 
ALRRAVAFRRFRAADV 
PAGDALVLDLPTVETR 
S 
Seq. ID No. 29
Seq. ID No. 30

1 gtgacgtctg ctccgaccgt ctcggtgata acgatctcgt tcaacgacct cgacgggttg 
61 cagcgcacgg tgaaaagtgt gcgggcgcaa cgctaccggg gacgcaccga gcacaccgta 
121 atcgacggtg gcagcggcga cgacgtggtg gcatacctgt ccgggtgtga accaggcttc 
181 gcgcattggc agtccgagcc cgacggcggg cggtacgacg cgatgaacca gggcatcgcg 
241 cacgcatcgg gtgatctgtt gtggttcetg cactccgccg atcgtttttc cgggcccgac 
301 gtggtagccc aggccgegga ggcgctatcc ggcaagggac cggtgtccga attgtggggc 
361 cncgggatgg atcgtctcgc cgggctcgat ..qgggtgcg ^ gcccgatacc cttcagcctg 
421 cgcaaattcc tggccggcaa gcaggttgtt ccgcatct^catcgttcct cggatcatcg 
481 ctggcggcca agatcggtgg ctacgacctt gatttcggga tcgccgccga ccaggaattc 
541 atattgcggg ccgcgctggt atgcgagccg gtcacgactc ggtgtgtgct gtgcgagttc 
601 gacaccacgg gcgtcggctc gcaccgggaa ccaagcgcgg tcttcggtga tctgcgccgc 
661 atgggcgacc ttcatcgccg ctacccgttc gggggaaggc gaatatcaca tgcctaccta 
721 cgcggccggg agtcctacgc ctacaacagt cgattctggg aaaacgtctt cacgcgaatg 
781 tcgaaatag 



Seq. ID No. 31

1 MTSAPTVSVITISFNDLDGLQRTVKSVRAQ
31 RYRGRIEHIVIDGGSGDDVVAYLSGCEPGF
61 AYWQSEPDGGRYDAMNQGIAHASGDLLWFL
91 HSADRFSGPDVVAQAVEALSGKGPVSELWG
121 FGMDRLVGLDRVRGPIPFSLRKFLAGKQVV
151 PHQASFFGSSLVAKIGGYDLDFGIAADQEF
181 ILRAALVCEPVTIRCVLCSFDTTGVGSKRE
211 PSAVFGDLRRMGDLHRRYPFGGRRISHAYL
241 RGREFYAYNSRFWENVFTRMSK
Seq. ID No. 32

1 gtgaagcgag cgctcatcac cggaatcacc ggccaggacg gctcgtacct cgccgaactg 
61 ctgctggcca aggggtatga ggctcacggg ctcatccggc gcgcttcgac gtccaacacc 
121 tcgcggatcg atcacctcta cgtcgacccg caccaaccgg gcgcgcggcc gtttctgc^c 
181 tatggtgacc tgatcgacgg aacccggctg gtgaccctgc tgagcaccat cgaacccgac 
241 gaggtgtaca acctggcggc gcagtcacac gtgcgggtga gcttcgacga acccgtgcac
301 accggtgaca ccaccggcat gggatccatg cgactgctgg aagccgttcg gctctctcgg 
361 gtgcactgcc gcttctatca ggcgtcctcg tcggagacgt tcggcgcctc gccgccaccg 
421 cagaacgagc tgacgccgtt ctacccgcgg tcaccgtacg gcgccgccaa ggtctattcg 
481 tactgggcga cccgcaatta tcgcgaagcg tacggattgt tcgccgttaa cggcatcttg 
541 ttcaatcacg aatcaccgcg gcgcggtgag acgttcgtga cccgaaagat caccagggcc 
601 gtggcacgca tcaaggccgg tatccagtcc gaggtctata tgggcaatct ggatgcggtc 
661 cgcgacxggg ggtacgcgcc cgaatacgtc gaaggcatgc ggcggatgct gcagaccgac 
721 gagcccgacg acttcgtttt ggcgaccggg cgcggtttca ccgcgcgtga gttcgcgcgg 
781 gccgcgttcg agcatgccgg tttggactgg cagcagtacg tgaaattcga ccaacgctat 
841 ctgcggccca ccgaggtgga ttcgctgatc ggcgacgcga ccaaggctgc cgaattgctg 
901 ggctggaggg ctccggtgca cactgacgag ttggcccgga tcatggtcga cgcggacatg 
961 gcggcgccgg agtgcgaagg caagccgcgg atcgacaagc cgatgatcgc cggccggaca 
1021 tga 



Seq. ID No. 33
Seq. ID No. 34

1 atgaggctgg cccgtcgcgc tcggaacatc ttgcgtcgca acggcatcga ggtgtcgcgc 
61 tactttgccg aactggactg ggaacgcaac ttcttgcgcc aactgcaatc gcatcgggtc 
121 agtgccgtgc tcgatgtcgg ggccaattcg gggcagtacg ccaggggtct gcgcggcgcg 
181 ggcttcgcgg gccgcatcgt ctcgtccgag ccgctgcccg ggccctttgc cgtcttgcag 
241 cgcagcgcct ccacggaccc gttgtgggaa tgccggcgct gtgcgctggg cgatgtcgat 
301 ggaaccatct cgaccaacgt cgccggcaac gagggcgcca gcagttccgt cttgccgatc
361 tcgaaacgac atcaggacgc ctttccacca gccaaccacg tgggcgccca acgggtgccg 
421 atacatcgac tcgatcccgt ggctgcagac gttctgcggc ccaacgatat tgcgttcttg 
481 aagaccgacg ttcaaggatt cgagaagcag gtgatcgcgg gtggcgattc aacggtgcae
541 gaccgatgcg tcggcatgca gctcgagctg cctttccagc cgttgtacga gggtggcatg 
601 ctcatccgcg aggcgcccga tctcgcggat tcgtcgggct ttacgctctc gggattgcaa 
661 cccggtttca ccgacccccg caacggccga atgctgcagg ccgatggcat cttcttccgg 
721 ggcagcgact ga 

Seq. ID No. 35

Seq. ID No. 36

1 gtgaaatcgt tgaaactcgc tcgtttcatc gcgcgtagcg ccgcctccga ggtttcgcgc 
61 cgctattctg agcgagacct gaagcaccag tttgtgaagc aactcaaatc gcgtcgggta 
121 gatgtcgttt tcgatgtcgg cgccaactca ggacaacacg ccgccggcct ccgccgagca 
181 gcatataagg gccgcattgt ctcgttcgaa ccgctatccg gaccgtttac gatcttggaa 
241 agcaaagcgt caacggatcc actttgggat tgccggcagc atgcgttggg cgattctgat 
301 ggaacggtta cgatcaatat cgcaggaaac gccggccaga gcagttccgt cttgcccatg 
361 ctgaaaagtc atcagaacgc ttttcccccg gcaaactacg tcggtaccca agaggcgtcc 
421 atacatcgac ttgatcccgt ggcgccagaa tttctaggca tgaacggtgt cgcttttctc 
481 aaggtcgacg ttcaaggctt tgaaaagcag gtgctcgccg ggggcaaatc aaccatagat 
541 gaccatcgcg tcggcatgca actcgaactg tccttcctgc cgttgtacga aggtggcatg 
601 ctcattcctg aagccctcga tctcgtgtat tccttgggct tcacgttgac gggattgctg 
661 ccttgtttca ttgatgcaaa taatggtcga atgttgcagg ccgacggcat cttttcccgc 
721 gaggacgatt ga 

Seq. ID No. 38

1 atggtgcaga cgaaacgata cgccggcctg accgcagcta acacaaagaa agtcgccatg 
61 gccgcaccaa cgttctcgat catcaccccc acctcgaacg tggctgcggt actgcctgcc 
121 tgcctcgaca gcatcgcccg ccagacctgc ggtgacttcg agctggtact ggtcgacggc 
181 ggctcgacgg acgaaaccct cgacaccgcc aacattttcg cccccaacct cggcgagcgg 
241 ttgatcattc atcgcgacac cgaccagggc gtctacgacg ccatgaaccg cggcgtggac 
301 ctggccaccg gaacgtggtt gctctttctg ggcgcggacg acagcctgta cgaggctgac 
361 accctggcgc gggtggccgc cttcattggc gaacacgagc ccagcgatct ggtatatggc 
421 gacgtgatca tgcgctcaac caatttccgc tggggtggcg ccttcgacct cgaccgtctg 
481 ttgttcaagc gcaacatctg ccatcaggcg atcttctacc gccgcggact cttcggcacc 
541 atcggtccct acaacctccg ctaccgggtc ctggccgact gggacttcaa tattcgctgc 
601 ttttccaacc cagcgctcgt cacccgctac atgcacgtgg tcgttgcaag ctacaacgaa 
661 ttcggcgggc tcagcaatac gatcgtcgac aaggagtttt tgaagcggct gccgatgtcc 
721 acgagactcg gcataaggct ggtcatagtt ctggcgcgca ggtggccaaa ggtgatcagc 
781 agggccatgg taatgcgcac cgtcatttct tggcggcgcc gacgttag 

Seq 40: 

GATGCCGTGAGGAGGTAAAGCTGC 

Seq 41: 

GATACGGCTCTTGAATCCTGCACG 



CLAIMS 



1. A polypeptide in substantially isolated form which 
comprises a sequence selected from the sequences of 
Seq.ID.No: 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 
29, or a polypeptide substantially homologous thereto. 

2. A polypeptide in substantially isolated form which 
comprises a sequence selected from the sequences of 
Seq.ID.No: 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 
29. 

3. A polypeptide which comprises a fragment of a 
polypeptide defined in claim 1 or 2, said fragment 
comprising at least 12 amino acids and an epitope. 

4. A polynucleotide in substantially isolated form which 
encodes a polypeptide according to any one of claims 1 to 
3. 

5. A polynucleotide in substantially isolated form which 
is capable of selectively hybridizing to Seq.ID.No: 3 or 4 
or a fragment thereof. 

6. A polynucleotide fragment according to claim 5 which 
comprises a sequence selected from the sequences of 
Seq.ID.No: 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25 and 27, 
or a polynucleotide at least 90% homologous thereto. 

7. A polynucleotide in substantially isolated form 
comprising a sequence selected from the sequences of 
Seq.ID.No: 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25 and 27. 

8. A polynucleotide probe which comprises a fragment of 
at least 15 nucleotides of a polynucleotide as defined in 
any one of claims 4 to 7, optionally carrying a revealing 
label. 

9. A recombinant vector carrying a polynucleotide as 
defined in any one of claims 4 to 7. 



10. An antibody capable of binding a polypeptide or 
fragment thereof as defined in any one of claims 1 to 3. 

11. An antibody capable of binding a polypeptide or 
fragment thereof wherein the polypeptide is a polypeptide 
which comprises a sequence selected from the sequences of 
Seq.ID.No: 31, 33, 35, 37 and 39 or is a peptide 
substantially homologous thereto. 

12. A test kit for detecting the presence or absence of a 
pathogenic mycobacterium in a sample which comprises a 
polynucleotide according to any one of claims 4 to 8, a 
polypeptide according to any one of claims 1 to 3, a 
polypeptide which comprises a sequence selected from the 
sequences of Seq.ID.No: 31, 33, 35, 37 and 39 or a 
polypeptide substantially homologous thereto, or an 
antibody according to any one of claims 10 or 11. 

13. A method of detecting the presence or absence of 
antibodies in an animal or human, against a pathogenic 
mycobacteria in a sample which comprises: 

(a) providing a polypeptide according to any one of 
claims 1 to 3 or a polypeptide which comprises a 
sequence selected from the sequences of 
Seq.ID.No: 31, 33, 35, 37 and 39 or a polypeptide 
substantially homologous thereto, which 
comprises an epitope; 

(b) incubating a biological sample with said 
polypeptide under conditions which allow for the 
formation of an antibody-antigen complex; and 

(c) determining whether antibody-antigen complex 
comprising said polypeptide is formed. 

14. A method of detecting the presence or absence of a 
polypeptide according to any one of claims 1 to 3 or a 
polypeptide which comprises a sequence selected from the 
sequences of Seq.ID.No: 31, 33, 35, 37 and 39 or a 
polypeptide substantially homologous thereto in a 
biological sample, which method comprises: 

(a) providing an antibody according to any one of 
claims 10 and 11; 

(b) incubating a biological sample with said antibody 
under conditions which allow for the formation of 
an antibody-antigen complex; and 

(c) determining whether antibody-antigen complex 
comprising said antibody is formed. 

15. A method of detecting the presence or absence of cell 
mediated immune reactivity in an animal or human, to a 
polypeptide according to claims 1 to 3 or a polypeptide 
which comprises a sequence selected from the sequences of 
Seq.ID.No: 31, 33, 35, 37 and 39 or a polypeptide 
substantially homologous thereto, which method comprises: 

(a) providing a polypeptide according to any one of 
claims 1 to 3 or a polypeptide which comprises a 
sequence selected from the sequences of 
Seq.ID.No: 31, 33, 35, 37 and 39 or a polypeptide 
substantially homologous thereto, which comprises 
an epitope; 

(b) incubating a cell sample with said polypeptide 
under conditions which allow for a cellular 
immune response such as release of cytokines or 
other mediator or reaction to occur; and 

(c) detecting the presence of said cytokine or 
mediator or cellular response in the incubate. 

16. A pharmaceutical composition comprising a polypeptide 
according to any one of claims 1 to 3 in a suitable carrier 
or diluent. 

17. A composition according to claim 16 or a composition 
comprising a polypeptide which comprises a sequence 
selected from the sequences of Seq.ID.No: 31, 33, 35, 37 
and 39 or a polypeptide substantially homologous thereto, 
for use in the treatment or prevention of diseases caused 
by mycobacteria. 

18. A method of treating or preventing mycobacterial 
disease in an animal or human caused by mycobacteria which 
express a polypeptide according to claims 1 to 3 or a 
polypeptide which comprises a sequence selected from the 
sequences of Seq.ID.No: 31, 33, 35, 37 and 39 or a 
polypeptide substantially homologous thereto, which method 
comprises vaccinating or treating an animal or human with 
an effective amount of said polypeptide. 

19. A method of treating or preventing mycobacterial 
diseases in animals or humans caused by mycobacteria 
containing the polynucleotide of Seq.ID.No: 3 or 4, which 
method comprises vaccinating or treating an animal or human 
with an effective amount of a polynucleotide according to 
claims 4 to 7, a vector according to claim 9 or a 
polynucleotide which encodes a polypeptide which comprises 
a sequence selected from the sequences of Seq.ID.No: 31, 
33, 35, 37 and 39 or a polypeptide substantially homologous 
thereto. 

20. A method according to claims 18 or 19 for increasing 
the in vivo susceptibility of mycobacteria to antimicrobial 
drugs. 

21. A normally pathogenic mycobacterium, whose 
pathogenicity is mediated in all or in part by the presence 
or the expression of a polypeptide as defined in any one of 
claims 1 to 3 or a polypeptide which comprises a sequence 
selected from the sequences of Seq.ID.No: 31, 33, 35, 37 
and 39 or a polypeptide substantially homologous thereto, 
which mycobacterium harbours an attenuating mutation in a 
gene encoding one of the said polypeptides. 

22. A vaccine comprising a mycobacterium as claimed in 
claim 21. 

23. A vaccine according to claim 22 wherein the 
mycobacteria is selected from Mavs, Mptb and Mtb. 

<140> 09/091,538 

<160> 41 

<210> 1 
<211> 674 
<212> DNA 

<213> Mycobacterium 



<400> 1 
gatccaacta aacccgatgg aaccccgcgc aaactattgg acgtctccgc gctacgcagt 60 
tgggttggcg cccgcgaatc gcactgaaag agggcatcga tgcaacggtg tcgtggtacc 120 
gcacaaatgc cgatgccgtg aggaggtaaa gctgcgggcc ggccgatgtt atccctccgg 180 
ccggacgggt agggcgacct gccatcgagt ggtacggcag tcgccrggcc ggcgaggcgc 240 
atggcctatg tgagtatccc atagcctggc ttggctcgcc cctacgcatt atcagttgac 300 
cgctttcgcg ccacgtcgca ggcttgcggc agcatcccgt tcaggtctcc tcatggtccg 360 
gtgtggcacg accacgcaag ctcgaaccga ctcgtttccc aatttcgcat gctaatatcg 420 
ctcgatggat tttttgcgca acgccggctt gatggctcgt aacgttagca ccgagatgct 480 
gcgccactcc gaacgaaagc gcctattagt aaaccaagtc gaagcatacg gagtcaacgt 540 
tgttattgat gtcggtgcta actccggcca gttcggtagc gctttgcgtc gtgcaggatt 600 
caagagccgt atcgtttcct ttgaacctct ttcggggcca tttgcgcaac taacgcgcaa 660 
gtcggcatcg gatc 674 



<210> 2 
<211> 674 



<212> DNA 

<213> Mycobacterium 
<400> 2 

gatccgatgc cgacttgcgc gttagttgcg caaatggccc cgaaagaggt tcaaaggaaa 60 
cgatacggct cttgaatcct gcacgacgca aagcgctacc gaactggccg gagttagcac 120 
cgacatcaat aacaacgttg actccgtatg cttcgacttg gtttactaat aggcgctttc 180 
gttcggagtg gcgcagcatc tcggtgctaa cgttacgagc catcaagccg gcgttgcgca 240 
aaaaatccat cgagcgatat tagcatgcga aattgggaaa cgagtcggtt cgagcttgcg 300 
tggtcgtgcc acaccggacc atgaggagac ctgaacggga tgctgccgca agcctgcgac 360 
gtggcgcgaa agcggtcaac tgataatgcg taggggcgag ccaagccagg ctatgggata 420 
ctcacatagg ccatgcgcct cgccggccag gcgactgccg taccactcga tggcaggtcg 480 
ccctacccgt ccggccggag ggataacatc ggccggcccg cagctttacc tcctcacggc 540 
atcggcattt gtgcggtacc acgacaccgt tgcatcgatg ccctctttca gtgcgattcg 600 
cgggcgccaa cccaactgcg tagcgcggag acgtccaata gtttgcgcgg ggttccatcg 660 
ggtttagttg gatc 674 

<210> 3 
<211> 7995 
<212> DNA 

<213> Mycobacterium 
<400> 3 

gaattctggg ttggagacga cgtcgaactc ctggtcggtc ttgcttcgaa tgatcgctgt 60 
gatctggtcg gcggtgccga caggaaccgt cgacttgtcg acgatcacct tgtaccggtc 120 
gatgtatgac ccaatgtcgt ccgcaaccga gaagacgtac gtcaggtccg ccgccccgct 180 
ttcacccatg ggcgtcggga cggcgatgaa aatgacgtcc gcgtgctcga ttccgcgttg 240 
ccggtcggtg gtgaagtcaa tcagcccgtt ctcacggttc ctcgcaatca actcccaacc 300 
cgggctcgaa aatcgggaca ctgcctgcga ggagcaaatc gatcttggcc tgatcgatat 360 
cgacacagac gacatcgttg ccgctatccg cgagacaggc gcccgtgacg aggcctacat 420 
agcctgatcc gaccaccgaa attttcaaga tgaccccttc aagtccccga tcggtcgacg 480 
accatacrgc cgcaactctg taccctccgt gggtaattcg catgtcgcgt tcgtaaggag 540 
cagccagcga gtcggggacg ttcggtgaga gagtcgcagg actacgaggt tgccggtgcg 600 
atacatcaca gtgttgcgtc tgtcggcaac gatgcagcaa gaacccacgg ggcagccctg 660 
aactgcgcgc atgaccggtc cttgtcctgg cacctttgat cggccaccgc ttccatgcga 720 
acatgaccgg aatccatagc gcgtggtcaa gcagcgggga ggtagacgtc ggtgtcatct 780 
gctccaaccg tgtcggtgat aacgatttcg ctgaacgatc tcgagggatt gaaaagcacc 840 
gtggagagcg ttcgcgcgca gcgctatggg gggcgaatcg agcacatcgt catcgacggt 900 
ggatcgggcg acgccgtcgt ggagtatctg tccggcgatc ctggctttgc atattggcaa 960 
tctcagcccg acaacgggag atatgacgcg atgaatcagg gcattgccca ttcgtcgggc 1020 
gacctgttgt ggtttatgca ctccacggat cgtttctccg atccagargc agtcgcttcc 1080 
gtggrggagg cgctctcggg gcatggacca gtacgtgatt tgtggggtta cgggaaaaac 1140 
aaccttgtcg gactcgacgg caaaccactt ttccctcggc cgtacggcta tatgccgttt 1200 
aagatgcgga aatttctgct cggcgcgacg grtgcgcatc aggcgacatt cttcggcgcg 1260 
tcgctggtag ccaagttggg cggttacgat cttgattttg gactcgaggc ggaccagctg 1320 
ttcatctacc gtgccgcact aatacggcct cccgtcacga tcgaccgcgt ggtttgcgac 1380 
ttcgatgtca cgggacctgg ttcaacccag cccatccgtg agcactatcg gaccctgcgg 1440 
cggctctggg acctgcatgg cgactacccg ctgggtgggc gcagagtgtc gtgggcttac 1500 
ttgcgtgtga aggagtactt gattcgggcc gaccrggccg cattcaacgc ggtaaagttc 1560 
ttgcgagcga agttcgccag agcttcgcgg aagcaaaatt catagaaacc aacttctact 1620 
gcctgacctg agcagcgccg aggcgcgcag cgcgatcagt gcgacctgaa cggccaggtg 1680 
gaaagcgcca ccgatcccgg caccgagtgc ctgacgcttc ggatcccttg caccacaacg 1740 
agagtgagag cgccatgatg aggaaatatc ggctgggcgg agtcaacgcc ggagtgacaa 1800 
aagtgagaac ccggtgaagc gagcgcttat aacagggatc acggggcagg atggttccta 1860 
cctcgccgag ctactactga gcaagggata cgaggttcac gggctcgttc gtcgagcttc 1920 
gacgtttaac acgtcgcgga tcgatcacct ctacgttgac ccacaccaac cgggcgcgcg 1980 
cttgttcttg cactatgcag acctcactga cggcacccgg ttggtgaccc tgctcagcag 2040 
tatcgacccg gatgaggtct acaacctcgc agcgcagtcc catgtgcgcg tcagctttga 2100 
cgagccagtg cataccggag acaccaccgg catgggatcg atccgacttc tggaagcagt 2160 
ccgcctttct cgggtggact gccggttcta tcaggcttcc tcgtcggaga tgttcggcgc 2220 
atctccgcca ccgcagaacg aatcgacgcc gttctatccc cgttcgccat acggcgcggc 2280 
caaggtcttc tcgtactgga cgactcgcaa ctatcgagag gcgtacggat tattcgcagt 2340 
gaatggcatc ttgttcaacc atgagtcccc ccggcgcggc gagactttcg tgacccgaaa 2400 
gatcacgcgt gccgtggcgc gcatccgagc tggcgtccaa tcggaggtct atatgggcaa 2460 
cctcgatgcg atccgcgact ggggctacgc gcccgaatat gtcgagggga tgtggaggat 2520 
gttgcaagcg cctgaacctg atgactacgt cctggcgaca gggcgtggtt acaccgtacg 2580 
tgagttcgct caagctgctt ttgaccatgt cgggctcgac tggcaaaagc gcgtcaagtt 2640 
tgacgaccgc tatttgcgtc ccaccgaggt cgattcgcta gtaggagatg ccgacaaggc 2700 
ggcccagtca ctcggctgga aagcttcggt tcatactggt gaactcgcgc gcatcatggt 2760 
ggacgcggac atcgccgcgt tggagtgcga tggcacacca tggatcgaca cgccgatgtt 2820 
gcctggttgg ggcagagtaa gttgacgact acacctgggc ctctggaccg cgcaacgccc 2880 
gtgtatatcg ccggtcatcg ggggctggtc ggctcagcgc tcgtacgtag atttgaggcc 2940 
gaggggttca ccaatctcat tgtgcgatca cgcgatgaga ttgatctgac ggaccgagcc 3000 
gcaacgtttg attttgtgtc tgagacaaga ccacaggtga tcatcgatgc ggccgcacgg 3060 
gtcggcggca tcatggcgaa taacacctat cccgcggact tcttgtccga aaacctccga 3120 
atccagacca atttgctcga cgcagctgtc gccgtgcgtg tgccgcggct ccttttcctc 3180 
ggttcgrcat gcatctaccc gaagtacgct ccgcaaccta tccacgagag tgctttattg 3240 
actggccctt tggagcccac caacgacgcg tatgcgatcg ccaagatcgc cggtatcctg 3300 
caagttcagg cggttaggcg ccaatatggg ctggcgtgga tctctgcgat gccgactaac 3360 
ctctacggac ccggcgacaa cttctccccg tccgggtcgc atctcttgcc ggcgctcatc 3420 
cgtcgatatg aggaagccaa agctggtggt gcagaagagg tgacgaattg ggggaccggt 3480 
actccgcggc gcgaacttct gcatgtcgac gatctggcga gcgcatgcct gttccttttg 3540 
gaacatttcg atggtccgaa ccacgtcaac gtgggcaccg gcgtcgatca cagcattagc 3600 
gagatcgcag acatggtcgc tacagcggtg ggctacatcg gcgaaacacg ttgggatcca 3660 
actaaacccg atggaacccc gcgcaaacta ttggacgtct ccgcgctacg cgagttgggt 3720 
tggcgcccgc gaatcgcact gaaagacggc atcgatgcaa cggtgtcgtg gtaccgcaca 3780 
aatgccgatg ccgtgaggag gtaaagctgc gggtcggccg atgttatccc tccggccgga 3840 
cgggtggggc gacctgccgt cgagtggtac ggcagtcgcc tggccggcga ggcgcgtggc 3900 
ctatgggagt atccaatagc ctggcttggc tcgcccctac gcattatcag ttgaccgctt 3960 
tcgcgccagc tcgcaggctt gcggcagcat cccgttcagg tctcctcatg gtccggtgtg 4020 
gcacgaccac gcaagctcga accgactcgt ttcccaattt cgcatgctaa tatcgctcga 4080 
tggatttttt gcgcaacgcc ggcttgatgg ctcgtaacgt tagtaccgag atgctgcgcc 4140 
acttcgaacg aaagcgccta ttagtaaacc aattcaaagc atacggagtc aacgttgtta 4200 
ttgatgtcgg tgctaactcc ggccagttcg gtagcgcttt gcgtcgtgca ggattcaaga 4260 
gccgtatcgt ttcctttgaa cctctttcgg ggccatttgc gcaactaacg cgcaagtcgg 4320 
catcggatcc actatgggag tgtcaccagt atgccctagg cgacgccgat gagacgatta 4380 
ccatcaatgt ggcaggcaat gcgggggcaa gtagttccgt gctgccgatg cttaaaagtc 4440 
gcagccgatc gaggatgcrg gcggcggtgg tgtgctcggg caggaatcgc ccccattgtt 6360 
cgaagggcca atgcgaggcg atggccaggg agcggcgctc gtagccggca gccacgagcc 6420 
ggaacaacag ttgagtcccg gtgtcgtcga gcggggcgaa gccgatctcg tccaagatga 6480 
ccagatccgc gcggagcagg gtgtcgatga tcttgccgac ggtgttgtcg gccaggccgc 6540 
ggtagaggac ctcgatcagg tcggcggcgg tgaagtagcg gactttgaat ccggcgtgga 6600 
cggcagcgtg cccgcagccg atgagcaggt gacttttgcc cgtaccaggt gggccaatga 6660 
ccgccaggtt ctgttgtgcc cgaatccatt ccaggctcga caggtagtcg aacgtggctg 6720 
cggtgatcga cgatccggtg acgtcgaacc cgtcgagggt cttggtgacc gggaaggctg 6780 
cggccttgag acggttggcg gtgttggagg catcgcgggc agcgatctcg gcctcaacca 6840 
acgtccgcag gatctcctcc ggtgtccagc gttgcgtctt ggcgacttgc aacacctcgg 6900 
cggcgttgcg gcgcaccgtg gccagcttca accgccgcag cgccgcgtca aggtcagcag 6960 
ccagcggtgc cgccgaggac ggtgccaccg gcttggcagc ggtggtcatg aggccgtccc 7020 
gtcggtggtg ttgatcttgt aggcctccaa cgagcgggtc tcgacggtgg gcagatcgag 7080 
cacgagtgcg tcgccggcgg ggcggggrtg tggggtgccg gcgccggcgg ccaggatcga 7140 
gcgcacgtcg gcagcgcgga accggcgaaa cgcaaccgcc cggcgcagcg cgtcaatcaa 7200 
agcctgttcg ccgtgggcgg cgccaaggcc gagcagaatg tcgagttcgg atttcagtcg 7260 
ggtgttgccg atcgcagcag caccgacgag gaactgctgc gcttcggtrc ccaatgcgca 7320 
gaatcgrttc tctgctrggg rzttcgggcg aggaccacgc gagggtgcgg grctgggtcc 7380 
gtcgtagtgt tcatcgagga tggacacctc acctgggctg acgagctcgt gctcggccac 7440 
gatcacaccg gtcgcaggrt ccaacaggat cagggcgcca tgatcgacca ccaccgccac 7500 
ggtggcaccg acgagccgct gaggcaccga gtaacgagct gagccgtaac ggatgcacga 7560 
gaggccgtcg accttacggc gcaccgaccc cgagccgatc gtcggccgca gcgagggcag 7620 
ctccctcaag acggtgcgct cgtcaaccaa gcgatcgttg ggcacggcgc agatctccga 7680 
gtggaccgtg gcattgacct cggcgcacca tagttgcgcc tgggcgttga gggcacgtag 7740 
gtcgacctgc tcaccggcta acgcagcttc ggtcagcagc ggcaccgcaa ggtcgtcctg 7800 
agcgtagcca cagaggttct ccacgatgcc cttcgattgc ggatccgcac cgtggcagaa 7860 
gtccggaacg aagccatagt gggacgcgaa tcgcacataa tccggtgttg gaacaacaac 7920 
attggcgacg acaccacctt tgaggcagcc catccggtcg gccaggatct tggccggaac 7980 
cccaccgatc gcctc 7995 



<210> 4 
<211> 4435 
<212> DNA 

<213> Mycobacterium 
<400> 4 

ttctactgcc tgacctgagc agcgccgagg cgcgcagcgc gatcactgcg acctgaatgg 60 
ccaggtggaa agcgccaccg atcccggcac cgagtgcctg acgattcgga tcccttgcac 120 
cacaacgaga gtgagaccgc catgatgacg aaatatcggc tgggcggagt caacgccgga 180 
gtgacaaaag tgagaacccg gtgaagcgag cgcttataac agggatcacg gggcaggarg 240 
gttcctacct cgccgagcta ctactgagca agggatacga ggttcacggg ctcgttcgtc 300 
gagcttcgac gtttaacacg tcgcggatcg atcacctcta cgttgaccca caccaaccgg 360 
gcgcgcgctt gttcttgcac tatgcagacc tcactgacgg cacccggttg gtgaccctgc 420 
tcagcagtat cgacccggat gaggtctaca acctcgcagc gcagtcccat gtgcgcgtca 480 
gctttgacga gccagtgcat accggagaca ccaccggcat gggatcgatc cgacttctgg 540 
aagcagtccg cctttctcgg gtggactgcc ggttctatca ggcttcctcg tcggagatgt 600 
tcggcgcatc tccgccaccg cagaacgaat cgacgccgtt ctatccccgt tcgccatacg 660 
gcgcggccaa ggtcttctcg tactggacga ctcgcaacta tcgagaggcg tacggattat 720 
tcgcagtgaa tggcatcttg ttcaaccatg agtccccccg gcgcggcgag acrttcgtga 780 
cccgaaagat cacgcgtgcc gtggcgcgca tccgagcrgg ctgccaatcg gaggtctata 840 
tgggcaacct cgatgcgatc cgcgactggg gctacgcgcc cgaatatgtc gaggggatgt 900 
ggaggatgtt gcaagcgcct gaacctgatg actacgtcct ggcgacaggg cgtggttaca 960 
ccgtacgtga gttcgctcaa gctgcttttg accacgtcgg gcrcgactgg caaaagcacg 1020 
tcaagrttga cgaccgctat ttgcgcccca ccgaggtcga ttcgctagta ggagatgccg 1080 
acagggcggc ccagtcactc ggctggaaag cttcggttca tactggtgaa ctcgcgcgca 1140 
tcatggtgga cgcggacatc gccgcgtcgg agtgcgatgg cacaccatgg atcgacacgc 1200 
cgatgttgcc tggttggggc ggagtaagtt gacgactaca cctgggcctc tggaccgcgc 1260 
aacgcccgtg tatatcgccg gtcatcgggg gctggtcggc tcagcgctcg tacgtagatt 1320 
tgaggccgag gggttcacca atctcattgt gcgatcacgc gatgagattg atctgacgga 1380 
ccgagccgca acgtttgatt ttgtgtctga gacaagacca caggtgatca tcgatgcggc 1440 
cgcacgggtc ggcggcatca tggcgaataa cacctatccc gcggacttct tgtccgaaaa 1500 
cctccgaatc cagaccaatt tgctcgacgc agctgtcgcc gtgcgtgtgc cgcggctcct 1560 
tttcctcggt tcgtcatgca rctacccgaa gtacgctccg caacctatcc acgagagtgc 1620 
tttattgact ggccctttgg agcccaccaa cgacgcgtat gcgatcgcca agatcgccgg 1680 
tatcctgcaa gttcaggcgg ttaggcgcca atatgggctg gcgtggatct ctgcgatgcc 1740 
gactaacctc tacggacccg gcgacaactt ctccccgtcc gggtcgcatc tcttgccggc 1800 
gctcatccgt cgatatgagg aagccaaagc tggtggtgca gaagaggtga cgaattgggg 1860 
gaccggtact ccgcggcgcg aacttctgca tgtcgacgat ctggcgagcg catgcctgtt 1920 
ccttttggaa catttcgatg gtccgaacca cgtcaacgtg ggcaccggcg tcgatcacag 1980 
cattagcgag atcgcagaca tggtcgctac ggcggtgggc tacatcggcg aaacacgttg 2040 
ggatccaact aaacccgatg gaaccccgcg caaactattg gacgtctccg cgctacgcga 2100 
gttgggrtgg cgcccgcgaa tcgcactgaa agacggcatc gatgcaacgg tgtcgtggta 2160 
ccgcacaaat gccgatgccg tgaggaggta aagctgcggg ccggccgatg ttatccctcc 2220 
ggccggacgg gtagggcgac ctgccatcga gtggtacggc agtcgcctgg ccggcgaggc 2280 
gcatggccta tgggagtatc ccatagcctg gcttggctcg cccctacgca ttatcagttg 2340 
accgctttcg cgccagctcg caggctcgcg gcagcatccc gttcaggtct cctcatggtc 2400 
cggtgtggca cgaccacgca agctcgaacc gactcgtttc ccaatttcgc atgctaatat 2460 
cgctcgatgg attttttgcg caacgccggc ttgatggctc gtaacgttag caccgagatg 2520 
ctgcgccact tcgaacgaaa gcgcctatta gtaaaccaat tcaaagcata cggagtcaac 2580 
gttgttattg atgtcggtgc taactccggc cagttcggta gcgctttgcg tcgtgcagga 2640 
ttcaagagcc gtarcgtttc ctttgaacct ctrtcggggc catttgcgca actaacgcgc 2700 
gagtcggcat cggatccact atgggagtgt caccagtatg ccctaggcga cgccgatgag 2760 
acgattacca tcaatgtggc aggcaatgcg ggggcaagta gttccgrgct gccgatgctt 2820 
aaaagtcatc aagatgcctt tcctcccgcg aattatattg gcaccgaaga cgttgcaata 2880 
caccgccttg attcggttgc arcagaattt ctgaacccta ccgargttac tttcctgaag 2940 
atcgacgtac agggtttcga gaagcaggtt atcgcgggca gtaagtcaac gcttaacgaa 3000 
agctgcgtcg gcatgcaact cgaactttct tttattccgt tgtacgaagg tgacatgctg 3060 
attcatgaag cgcttgaact tgtctattcc ctaggtttca gactgacggg tttgttgccc 3120 
ggatttacgg atccgcgcaa tggtcgaatg cttcaagctg acggcatttt cttccgtggg 3180 
gacgattgac ataaatgctt gcgtcggcac cctgccggta tccaaacggg cgatctggtg 3240 
agccggcctc ccgggcacct aatcgactat ctaaattgag gcggccgcga cgtgcggcac 3300 
gaacaggtgg ccggctgcta gcgttacaca cgtcatgact gcgccagtgt tctcgataat 3360 
tatccctacc ttcaatgcag cggtgacgct gcaagcctgc ctcggaagca tcgtcgggca 3420 
gacctaccgg gaagtggaag tggtccttgt cgacggcggt tcgaccgatc ggaccctcga 3480 
catcgcgaac agtttccgcc cggaactcgg ctcgcgactg gtcgttcaca gcgggcccga 3540 
tgatggcccc tacgacgcca tgaaccgcgg cgtcggcgta gccacaggcg aatgggtact 3600 
ttttttaggc gccgacgaca ccctctacga accaaccacg ttggcccagg tagccgcttt 3660 
tctcggcgac catgcggcaa gccatcttgt ctatggcgat gttgtgatgc gttcgacgaa 3720 
aagccggcat gccggacctt tcgacctcga ccgcctccta tttgagacga atttgtgcca 3780 
ccaatcgatc ttttaccgcc gtgagctttt cgacggcatc ggcccttaca acctgcgcta 3840 
ccgagtctgg gcggactggg acttcaatat tcgctgcttc tccaacccgg cgctgattac 3900 
ccgctacatg gacgtcgtga tttccgaata caacgacatg accggcttca gcatgaggca 3960 
ggggactgat aaagagttca gaaaacggct gccaatgtac ttctgggttg cagggtggga 4020 
gacttgcagg cgcatgctgg cgtttttgaa agacaaggag aatcgccgtc tggccttgcg 4080 
tacgcggttg ataagggtta aggccgtctc caaagaacga agcgcagaac cgtagtcgcg 4140 
gatccacatt ggacttcttt aacgcgtttg cgtcctgatc cacctttcaa ccccgttccg 4200 
cgtgacgcgg cgcgcagaga gtggtcgcat atcgcgtcac tgttctcgtg ccagtgcttg 4260 
gaaagcgtcg agcactctgg ttcgcgttct tgacgttcgc gcccgcccct agaggtagcg 4320 
tgtcacgtga ctgaagccaa tgagtgcaac tcggcgtcgc gaaaggtttc agtcgcggtt 4380 
gagcaagaca ccgcaagact actggagtgc gtgcacaagc gcctccagct cacgg 4435 

<210> 5 
<211> 378 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(375) 

<400> 5 

atg atc gct gtg atc tgg tcg gcg gtg ccg aca gga acc gtc gac ttg 48 
Met Ile Ala Val Ile Trp Ser Ala Val Pro Thr Gly Thr Val Asp Leu 
1 5 10 15 

tcg acg atc acc ttg tac cgg tcg atg tat gac cca atg tcg tcc gca 96 
Ser Thr Ile Thr Leu Tyr Arg Ser Met Tyr Asp Pro Met Ser Ser Ala 
20 25 30 

acc gag aag acg tac gtc agg tcc gcc gcc ccg ctt tca ccc atg ggc 144 
Thr Glu Lys Thr Tyr Val Arg Ser Ala Ala Pro Leu Ser Pro Met Gly 
35 40 45 

gtc ggg acg gcg atg aaa atg acg tcc gcg tgc tcg att ccg cgt tgc 192 
Val Gly Thr Ala Met Lys Met Thr Ser Ala Cys Ser Ile Pro Arg Cys 
50 55 60 

cgg tcg gtg gtg aag tca atc agc ccg ttc tca cgg ttc ctc gca atc 240 
Arg Ser Val Val Lys Ser Ile Ser Pro Phe Ser Arg Phe Leu Ala Ile 
65 70 75 80 

aac tcc caa ccc ggg ctc gaa aat cgg gac act gcc tgc gag gag caa 288 
Asn Ser Gln Pro Gly Leu Glu Asn Arg Asp Thr Ala Cys Glu Glu Gln 
85 90 95 

atc gat ctt ggc ctg atc gat atc gac aca gac gac atc gtt gcc gct 336 
Ile Asp Leu Gly Leu Ile Asp Ile Asp Thr Asp Asp Ile Val Ala Ala 
100 105 110 

atc cgc gag aca ggc gcc cgt gac gag gcc tac ata gcc tga 378 
Ile Arg Glu Thr Gly Ala Arg Asp Glu Ala Tyr Ile Ala 
115 120 125 



<210> 6 
<211> 125 
<212> PRT 

<213> Mycobacterium 



<400> 6 

Met Ile Ala Val Ile Trp Ser Ala Val Pro Thr Gly Thr Val Asp Leu 
1 5 10 15 

Ser Thr Ile Thr Leu Tyr Arg Ser Met Tyr Asp Pro Met Ser Ser Ala 
20 25 30 

Thr Glu Lys Thr Tyr Val Arg Ser Ala Ala Pro Leu Ser Pro Met Gly 
35 40 45 

Val Gly Thr Ala Met Lys Met Thr Ser Ala Cys Ser Ile Pro Arg Cys 
50 55 60 

Arg Ser Val Val Lys Ser Ile Ser Pro Phe Ser Arg Phe Leu Ala Ile 
65 70 75 80 

Asn Ser Gln Pro Gly Leu Glu Asn Arg Asp Thr Ala Cys Glu Glu Gln 
85 90 95 

Ile Asp Leu Gly Leu Ile Asp Ile Asp Thr Asp Asp Ile Val Ala Ala 
100 105 110 

Ile Arg Glu Thr Gly Ala Arg Asp Glu Ala Tyr Ile Ala 
115 120 125 



<210> 7 
<211> 834 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(831) 

<400> 7 

gtg tca tct gct cca acc gtg tcg gtg ata acg att tcg ctg aac gat 48 
Val Ser Ser Ala Pro Thr Val Ser Val Ile Thr Ile Ser Leu Asn Asp 
1 5 10 15 

ctc gag gga ttg aaa agc acc gtg gag agc gtt cgc gcg cag cgc tat 96 
Leu Glu Gly Leu Lys Ser Thr Val Glu Ser Val Arg Ala Gln Arg Tyr 
20 25 30 

ggg ggg cga atc gag cac atc gtc atc gac ggt gga tcg ggc gac gcc 144 
Gly Gly Arg Ile Glu His Ile Val Ile Asp Gly Gly Ser Gly Asp Ala 
35 40 45 

gtc gtg gag tat ctg tcc ggc gat cct ggc ttt gca tat tgg caa tct 192 
Val Val Glu Tyr Leu Ser Gly Asp Pro Gly Phe Ala Tyr Trp Gln Ser 
50 55 60 

cag ccc gac aac ggg aga tat gac gcg atg aat cag ggc att gcc cat 240 
Gln Pro Asp Asn Gly Arg Tyr Asp Ala Met Asn Gln Gly Ile Ala His 
65 70 75 80 

tcg tcg ggc gac ctg ttg tgg ttt atg cac tcc acg gat cgt ttc tcc 288 
Ser Ser Gly Asp Leu Leu Trp Phe Met His Ser Thr Asp Arg Phe Ser 
85 90 95 

gat cca gat gca gtc gct tcc gtg gtg gag gcg ctc tcg ggg cat gga 336 
Asp Pro Asp Ala Val Ala Ser Val Val Glu Ala Leu Ser Gly His Gly 
100 105 110 

cca gta cgt gat ttg tgg ggt tac ggg aaa aac aac ctt gtc gga ctc 384 
Pro Val Arg Asp Leu Trp Gly Tyr Gly Lys Asn Asn Leu Val Gly Leu 
115 120 125 

gac ggc aaa cca ctt ttc cct cgg ccg tac ggc tat atg ccg ttt aag 432 
Asp Gly Lys Pro Leu Phe Pro Arg Pro Tyr Gly Tyr Met Pro Phe Lys 
130 135 140 

atg cgg aaa ttt ctg ctc ggc gcg acg gtt gcg cat cag gcg aca ttc 480 
Met Arg Lys Phe Leu Leu Gly Ala Thr Val Ala His Gln Ala Thr Phe 
145 150 155 160 

ttc ggc gcg tcg ctg gta gcc aag ttg ggc ggt tac gat ctt gat ttt 528 
Phe Gly Ala Ser Leu Val Ala Lys Leu Gly Gly Tyr Asp Leu Asp Phe 
165 170 175 

gga ctc gag gcg gac cag ctg ttc atc tac cgt gcc gca cta ata cgg 576 
Gly Leu Glu Ala Asp Gln Leu Phe Ile Tyr Arg Ala Ala Leu Ile Arg 
180 185 190 

cct ccc gtc acg atc gac cgc gtg gtt tgc gac ttc gat gtc acg gga 624 
Pro Pro Val Thr Ile Asp Arg Val Val Cys Asp Phe Asp Val Thr Gly 
195 200 205 

cct ggt tca acc cag ccc atc cgt gag cac tat cgg acc ctg cgg cgg 672 
Pro Gly Ser Thr Gln Pro Ile Arg Glu His Tyr Arg Thr Leu Arg Arg 
210 215 220 

ctc tgg gac ctg cat ggc gac tac ccg ctg ggt ggg cgc aga gtg tcg 720 
Leu Trp Asp Leu His Gly Asp Tyr Pro Leu Gly Gly Arg Arg Val Ser 
225 230 235 240 

tgg gct tac ttg cgt gtg aag gag tac ttg att cgg gcc gac ctg gcc 768 
Trp Ala Tyr Leu Arg Val Lys Glu Tyr Leu Ile Arg Ala Asp Leu Ala 
245 250 255 

gca ttc aac gcg gta aag ttc ttg cga gcg aag ttc gcc aga gct tcg 816 
Ala Phe Asn Ala Val Lys Phe Leu Arg Ala Lys Phe Ala Arg Ala Ser 
260 265 270 

cgg aag caa aat tca tag 834 
Arg Lys Gln Asn Ser 
275 



<210> 8 
<211> 277 
<212> PRT 

<213> Mycobacterium 
<400> 8 

Val Ser Ser Ala Pro Thr Val Ser Val Ile Thr Ile Ser Leu Asn Asp 
1 5 10 15 

Leu Glu Gly Leu Lys Ser Thr Val Glu Ser Val Arg Ala Gln Arg Tyr 
20 25 30 

Gly Gly Arg Ile Glu His Ile Val Ile Asp Gly Gly Ser Gly Asp Ala 
35 40 45 

Val Val Glu Tyr Leu Ser Gly Asp Pro Gly Phe Ala Tyr Trp Gln Ser 
50 55 60 

Gln Pro Asp Asn Gly Arg Tyr Asp Ala Met Asn Gln Gly Ile Ala His 
65 70 75 80 

Ser Ser Gly Asp Leu Leu Trp Phe Met His Ser Thr Asp Arg Phe Ser 
85 90 95 

Asp Pro Asp Ala Val Ala Ser Val Val Glu Ala Leu Ser Gly His Gly 
100 105 110 

Pro Val Arg Asp Leu Trp Gly Tyr Gly Lys Asn Asn Leu Val Gly Leu 
115 120 125 

Asp Gly Lys Pro Leu Phe Pro Arg Pro Tyr Gly Tyr Met Pro Phe Lys 
130 135 140 

Met Arg Lys Phe Leu Leu Gly Ala Thr Val Ala His Gln Ala Thr Phe 
145 150 155 160 

Phe Gly Ala Ser Leu Val Ala Lys Leu Gly Gly Tyr Asp Leu Asp Phe 
165 170 175 

Gly Leu Glu Ala Asp Gln Leu Phe Ile Tyr Arg Ala Ala Leu Ile Arg 
180 185 190 

Pro Pro Val Thr Ile Asp Arg Val Val Cys Asp Phe Asp Val Thr Gly 
195 200 205 

Pro Gly Ser Thr Gln Pro Ile Arg Glu His Tyr Arg Thr Leu Arg Arg 
210 215 220 

Leu Trp Asp Leu His Gly Asp Tyr Pro Leu Gly Gly Arg Arg Val Ser 
225 230 235 240 

Trp Ala Tyr Leu Arg Val Lys Glu Tyr Leu Ile Arg Ala Asp Leu Ala 
245 250 255 

Ala Phe Asn Ala Val Lys Phe Leu Arg Ala Lys Phe Ala Arg Ala Ser 
260 265 270 

Arg Lys Gln Asn Ser 
275 



<210> 9 
<211> 1032 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(1029) 



<400> 9 

gtg aag cga gcg ctt ata aca ggg atc acg ggg cag gat ggt tcc tac 48 
Val Lys Arg Ala Leu Ile Thr Gly Ile Thr Gly Gln Asp Gly Ser Tyr 
1 5 10 15 

ctc gcc gag cta cta ctg agc aag gga tac gag gtt cac ggg ctc gtt 96 
Leu Ala Glu Leu Leu Leu Ser Lys Gly Tyr Glu Val His Gly Leu Val 
20 25 30 

cgt cga gct tcg acg ttt aac acg tcg cgg atc gat cac ctc tac gtt 144 
Arg Arg Ala Ser Thr Phe Asn Thr Ser Arg Ile Asp His Leu Tyr Val 
35 40 45 

gac cca cac caa ccg ggc gcg cgc ttg ttc ttg cac rat gca gac ctc 192 
Asp Pro His Gln Pro Gly Ala Arg Leu Phe Leu His Tyr Ala Asp Leu 
50 55 60 

act gac ggc acc cgg ttg gtg acc ctg ctc agc agt atc gac ccg gat 240 
Thr Asp Gly Thr Arg Leu Val Thr Leu Leu Ser Ser Ile Asp Pro Asp 
65 70 75 80 

gag gtc tac aac ctc gca gcg cag tcc cat gtg cgc gtc agc ttt gac 288 
Glu Val Tyr Asn Leu Ala Ala Gln Ser His Val Arg Val Ser Phe Asp 
85 90 95 

gag cca gtg cat acc gga gac acc acc ggc atg gga tcg atc cga ctt 336 
Glu Pro Val His Thr Gly Asp Thr Thr Gly Met Gly Ser Ile Arg Leu 
100 105 110 

ctg gaa gca gtc cgc ctt tct cgg gtg gac tgc cgg ttc tat cag gct 384 
Leu Glu Ala Val Arg Leu Ser Arg Val Asp Cys Arg Phe Tyr Gln Ala 
115 120 125 

tcc tcg tcg gag atg ttc ggc gca tct ccg cca ccg cag aac gaa tcg 432 
Ser Ser Ser Glu Met Phe Gly Ala Ser Pro Pro Pro Gln Asn Glu Ser 
130 135 140 

acg ccg ttc tat ccc cgt tcg cca tac ggc gcg gcc aag gtc ttc tcg 480 
Thr Pro Phe Tyr Pro Arg Ser Pro Tyr Gly Ala Ala Lys Val Phe Ser 
145 150 155 160 

tac tgg acg act cgc aac tat cga gag gcg tac gga tta ttc gca gtg 528 
Tyr Trp Thr Thr Arg Asn Tyr Arg Glu Ala Tyr Gly Leu Phe Ala Val 
165 170 175 

aat ggc atc ttg ttc aac cat gag tcc ccc cgg cgc ggc gag act ttc 576 
Asn Gly Ile Leu Phe Asn His Glu Ser Pro Arg Arg Gly Glu Thr Phe 
180 185 190 

gtg acc cga aag atc acg cgt gcc gtg gcg cgc atc cga gct ggc gtc 624 
Val Thr Arg Lys Ile Thr Arg Ala Val Ala Arg Ile Arg Ala Gly Val 
195 200 205 

caa tcg gag gtc tat atg ggc aac ctc gat gcg atc cgc gac tgg ggc 672 
Gln Ser Glu Val Tyr Met Gly Asn Leu Asp Ala Ile Arg Asp Trp Gly 
210 215 220 

tac gcg ccc gaa tat gtc gag ggg atg tgg agg atg ttg caa gcg cct 720 

Tyr Ala Pro Glu Tyr Val Glu Gly Met Trp Arg Met Leu Gin Ala Pro 
225 230 235 240 

gaa cct gat gac tac gtc ctg gcg aca ggg cgt ggt tac acc gta cgt 768 

Glu Pro Asp Asp Tyr Val Leu Ala Thr Gly Arg Gly Tyr Thr Val Arg 
245 250 255 

gag ttc get caa get get ttt gac cat gtc ggg etc gac tgg caa aag 816 

Glu Phe Ala Gin Ala Ala Phe Asp His Val Gly Leu Asp Trp Gin Lys 
260 265 270 

cgc gtc aag ttt gac gac cgc tat ttg cgt ccc acc gag gtc gat teg 864 

Arg Val Lys Phe Asp Asp Arg Tyr Leu Arg Pro Thr Glu Val Asp Ser 
275 280 285 

eta gta gga gat gec gac aag gcg gec cag tea etc ggc tgg aaa get 912 

Leu Val Gly Asp Ala Asp Lys Ala Ala Gin Ser Leu Gly Trp Lys Ala 

290 295 300 

teg gtt cat act ggt gaa etc gcg cgc ate atg gtg gac gcg gac ate 960 

Ser Val His Thr Gly Glu Leu Ala Arg He Met Val Asp Ala Asp He 
305 310 315 320 

gee gcg ttg gag tgc gat ggc aca cca tgg ate gac acg ccg atg ttg 1008 

Ala Ala Leu Glu Cys Asp Gly Thr Pro Trp He Asp Thr Pro Met Leu 
325 330 335 

cct ggt tgg ggc aga gta agt tga 1032 

Pro Gly Trp Gly Arg Val Ser 
340 



<210> 10 
<211> 343 
<212> PRT 

<213> Mycobacterium 
<400> 10 

Val Lys Arg Ala Leu Ile Thr Gly Ile Thr Gly Gln Asp Gly Ser Tyr
1 5 10 15

Leu Ala Glu Leu Leu Leu Ser Lys Gly Tyr Glu Val His Gly Leu Val
20 25 30

Arg Arg Ala Ser Thr Phe Asn Thr Ser Arg Ile Asp His Leu Tyr Val
35 40 45

Asp Pro His Gln Pro Gly Ala Arg Leu Phe Leu His Tyr Ala Asp Leu
50 55 60

Thr Asp Gly Thr Arg Leu Val Thr Leu Leu Ser Ser Ile Asp Pro Asp
65 70 75 80

Glu Val Tyr Asn Leu Ala Ala Gln Ser His Val Arg Val Ser Phe Asp
85 90 95

Glu Pro Val His Thr Gly Asp Thr Thr Gly Met Gly Ser Ile Arg Leu
100 105 110

Leu Glu Ala Val Arg Leu Ser Arg Val Asp Cys Arg Phe Tyr Gln Ala
115 120 125

Ser Ser Ser Glu Met Phe Gly Ala Ser Pro Pro Pro Gln Asn Glu Ser
130 135 140

Thr Pro Phe Tyr Pro Arg Ser Pro Tyr Gly Ala Ala Lys Val Phe Ser
145 150 155 160

Tyr Trp Thr Thr Arg Asn Tyr Arg Glu Ala Tyr Gly Leu Phe Ala Val
165 170 175

Asn Gly Ile Leu Phe Asn His Glu Ser Pro Arg Arg Gly Glu Thr Phe
180 185 190

Val Thr Arg Lys Ile Thr Arg Ala Val Ala Arg Ile Arg Ala Gly Val
195 200 205

Gln Ser Glu Val Tyr Met Gly Asn Leu Asp Ala Ile Arg Asp Trp Gly
210 215 220

Tyr Ala Pro Glu Tyr Val Glu Gly Met Trp Arg Met Leu Gln Ala Pro
225 230 235 240

Glu Pro Asp Asp Tyr Val Leu Ala Thr Gly Arg Gly Tyr Thr Val Arg
245 250 255

Glu Phe Ala Gln Ala Ala Phe Asp His Val Gly Leu Asp Trp Gln Lys
260 265 270

Arg Val Lys Phe Asp Asp Arg Tyr Leu Arg Pro Thr Glu Val Asp Ser
275 280 285

Leu Val Gly Asp Ala Asp Lys Ala Ala Gln Ser Leu Gly Trp Lys Ala
290 295 300

Ser Val His Thr Gly Glu Leu Ala Arg Ile Met Val Asp Ala Asp Ile
305 310 315 320

Ala Ala Leu Glu Cys Asp Gly Thr Pro Trp Ile Asp Thr Pro Met Leu
325 330 335

Pro Gly Trp Gly Arg Val Ser
340

<210> 11
<211> 1032 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS

<222> (1)..(1029)

<400> 11 

gtg aag cga gcg ctt ata aca ggg atc acg ggg cag gat ggt tcc tac 48
Val Lys Arg Ala Leu Ile Thr Gly Ile Thr Gly Gln Asp Gly Ser Tyr
1 5 10 15

ctc gcc gag cta cta ctg agc aag gga tac gag gtt cac ggg ctc gtt 96
Leu Ala Glu Leu Leu Leu Ser Lys Gly Tyr Glu Val His Gly Leu Val
20 25 30

cgt cga gct tcg acg ttt aac acg tcg cgg atc gat cac ctc tac gtt 144
Arg Arg Ala Ser Thr Phe Asn Thr Ser Arg Ile Asp His Leu Tyr Val
35 40 45

gac cca cac caa ccg ggc gcg cgc ttg ttc ttg cac tat gca gac ctc 192
Asp Pro His Gln Pro Gly Ala Arg Leu Phe Leu His Tyr Ala Asp Leu
50 55 60

act gac ggc acc cgg ttg gtg acc ctg ctc agc agt atc gac ccg gat 240
Thr Asp Gly Thr Arg Leu Val Thr Leu Leu Ser Ser Ile Asp Pro Asp
65 70 75 80

gag gtc tac aac ctc gca gcg cag tcc cat gtg cgc gtc agc ttt gac 288
Glu Val Tyr Asn Leu Ala Ala Gln Ser His Val Arg Val Ser Phe Asp
85 90 95

gag cca gtg cat acc gga gac acc acc ggc atg gga tcg atc cga ctt 336
Glu Pro Val His Thr Gly Asp Thr Thr Gly Met Gly Ser Ile Arg Leu
100 105 110

ctg gaa gca gtc cgc ctt tct cgg gtg gac tgc cgg ttc tat cag gct 384
Leu Glu Ala Val Arg Leu Ser Arg Val Asp Cys Arg Phe Tyr Gln Ala
115 120 125

tcc tcg tcg gag atg ttc ggc gca tct ccg cca ccg cag aac gaa tcg 432
Ser Ser Ser Glu Met Phe Gly Ala Ser Pro Pro Pro Gln Asn Glu Ser
130 135 140

acg ccg ttc tat ccc cgt tcg cca tac ggc gcg gcc aag gtc ttc tcg 480
Thr Pro Phe Tyr Pro Arg Ser Pro Tyr Gly Ala Ala Lys Val Phe Ser
145 150 155 160

tac tgg acg act cgc aac tat cga gag gcg tac gga tta ttc gca gtg 528
Tyr Trp Thr Thr Arg Asn Tyr Arg Glu Ala Tyr Gly Leu Phe Ala Val
165 170 175

aat ggc atc ttg ttc aac cat gag tcc ccc cgg cgc ggc gag act ttc 576
Asn Gly Ile Leu Phe Asn His Glu Ser Pro Arg Arg Gly Glu Thr Phe
180 185 190

gtg acc cga aag atc acg cgt gcc gtg gcg cgc atc cga gct ggc gtc 624
Val Thr Arg Lys Ile Thr Arg Ala Val Ala Arg Ile Arg Ala Gly Val
195 200 205

caa tcg gag gtc tat atg ggc aac ctc gat gcg atc cgc gac tgg ggc 672
Gln Ser Glu Val Tyr Met Gly Asn Leu Asp Ala Ile Arg Asp Trp Gly
210 215 220

tac gcg ccc gaa tat gtc gag ggg atg tgg agg atg ttg caa gcg cct 720
Tyr Ala Pro Glu Tyr Val Glu Gly Met Trp Arg Met Leu Gln Ala Pro
225 230 235 240

gaa cct gat gac tac gtc ctg gcg aca ggg cgt ggt tac acc gta cgt 768
Glu Pro Asp Asp Tyr Val Leu Ala Thr Gly Arg Gly Tyr Thr Val Arg
245 250 255

gag ttc gct caa gct gct ttt gac cac gtc ggg ctc gac tgg caa aag 816
Glu Phe Ala Gln Ala Ala Phe Asp His Val Gly Leu Asp Trp Gln Lys
260 265 270

cac gtc aag ttt gac gac cgc tat ttg cgc ccc acc gag gtc gat tcg 864
His Val Lys Phe Asp Asp Arg Tyr Leu Arg Pro Thr Glu Val Asp Ser
275 280 285

cta gta gga gat gcc gac agg gcg gcc cag tca ctc ggc tgg aaa gct 912
Leu Val Gly Asp Ala Asp Arg Ala Ala Gln Ser Leu Gly Trp Lys Ala
290 295 300

tcg gtt cat act ggt gaa ctc gcg cgc atc atg gtg gac gcg gac atc 960
Ser Val His Thr Gly Glu Leu Ala Arg Ile Met Val Asp Ala Asp Ile
305 310 315 320

gcc gcg tcg gag tgc gat ggc aca cca tgg atc gac acg ccg atg ttg 1008
Ala Ala Ser Glu Cys Asp Gly Thr Pro Trp Ile Asp Thr Pro Met Leu
325 330 335

cct ggt tgg ggc gga gta agt tga 1032
Pro Gly Trp Gly Gly Val Ser
340



<210> 12 
<211> 343
<212> PRT 

<213> Mycobacterium 
<400> 12 

Val Lys Arg Ala Leu Ile Thr Gly Ile Thr Gly Gln Asp Gly Ser Tyr
1 5 10 15

Leu Ala Glu Leu Leu Leu Ser Lys Gly Tyr Glu Val His Gly Leu Val
20 25 30

Arg Arg Ala Ser Thr Phe Asn Thr Ser Arg Ile Asp His Leu Tyr Val
35 40 45

Asp Pro His Gln Pro Gly Ala Arg Leu Phe Leu His Tyr Ala Asp Leu
50 55 60

Thr Asp Gly Thr Arg Leu Val Thr Leu Leu Ser Ser Ile Asp Pro Asp
65 70 75 80

Glu Val Tyr Asn Leu Ala Ala Gln Ser His Val Arg Val Ser Phe Asp
85 90 95

Glu Pro Val His Thr Gly Asp Thr Thr Gly Met Gly Ser Ile Arg Leu
100 105 110

Leu Glu Ala Val Arg Leu Ser Arg Val Asp Cys Arg Phe Tyr Gln Ala
115 120 125

Ser Ser Ser Glu Met Phe Gly Ala Ser Pro Pro Pro Gln Asn Glu Ser
130 135 140

Thr Pro Phe Tyr Pro Arg Ser Pro Tyr Gly Ala Ala Lys Val Phe Ser
145 150 155 160

Tyr Trp Thr Thr Arg Asn Tyr Arg Glu Ala Tyr Gly Leu Phe Ala Val
165 170 175

Asn Gly Ile Leu Phe Asn His Glu Ser Pro Arg Arg Gly Glu Thr Phe
180 185 190

Val Thr Arg Lys Ile Thr Arg Ala Val Ala Arg Ile Arg Ala Gly Val
195 200 205

Gln Ser Glu Val Tyr Met Gly Asn Leu Asp Ala Ile Arg Asp Trp Gly
210 215 220

Tyr Ala Pro Glu Tyr Val Glu Gly Met Trp Arg Met Leu Gln Ala Pro
225 230 235 240

Glu Pro Asp Asp Tyr Val Leu Ala Thr Gly Arg Gly Tyr Thr Val Arg
245 250 255

Glu Phe Ala Gln Ala Ala Phe Asp His Val Gly Leu Asp Trp Gln Lys
260 265 270

His Val Lys Phe Asp Asp Arg Tyr Leu Arg Pro Thr Glu Val Asp Ser
275 280 285

Leu Val Gly Asp Ala Asp Arg Ala Ala Gln Ser Leu Gly Trp Lys Ala
290 295 300

Ser Val His Thr Gly Glu Leu Ala Arg Ile Met Val Asp Ala Asp Ile
305 310 315 320

Ala Ala Ser Glu Cys Asp Gly Thr Pro Trp Ile Asp Thr Pro Met Leu
325 330 335

Pro Gly Trp Gly Gly Val Ser
340



<210> 13 
<211> 1020 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(1017)




<400> 13 

gtg cga tgg cac acc atg gat cga cac gcc gat gtt gcc tgg ttg ggg 48
Val Arg Trp His Thr Met Asp Arg His Ala Asp Val Ala Trp Leu Gly
1 5 10 15

cag agt aag ttg acg act aca cct ggg cct ctg gac cgc gca acg ccc 96
Gln Ser Lys Leu Thr Thr Thr Pro Gly Pro Leu Asp Arg Ala Thr Pro
20 25 30

gtg tat atc gcc ggt cat cgg ggg ctg gtc ggc tca gcg ctc gta cgt 144
Val Tyr Ile Ala Gly His Arg Gly Leu Val Gly Ser Ala Leu Val Arg
35 40 45

aga ttt gag gcc gag ggg ttc acc aat ctc att gtg cga tca cgc gat 192
Arg Phe Glu Ala Glu Gly Phe Thr Asn Leu Ile Val Arg Ser Arg Asp
50 55 60

gag att gat ctg acg gac cga gcc gca acg ttt gat ttt gtg tct gag 240
Glu Ile Asp Leu Thr Asp Arg Ala Ala Thr Phe Asp Phe Val Ser Glu
65 70 75 80

aca aga cca cag gtg atc atc gat gcg gcc gca cgg gtc ggc ggc atc 288
Thr Arg Pro Gln Val Ile Ile Asp Ala Ala Ala Arg Val Gly Gly Ile
85 90 95

atg gcg aat aac acc tat ccc gcg gac ttc ttg tcc gaa aac ctc cga 336
Met Ala Asn Asn Thr Tyr Pro Ala Asp Phe Leu Ser Glu Asn Leu Arg
100 105 110

atc cag acc aat ttg ctc gac gca gct gtc gcc gtg cgr gtg ccg cgg 384
Ile Gln Thr Asn Leu Leu Asp Ala Ala Val Ala Val Arg Val Pro Arg
115 120 125

ctc ctt ttc ctc ggt tcg tca tgc atc tac ccg aag tac gct ccg caa 432
Leu Leu Phe Leu Gly Ser Ser Cys Ile Tyr Pro Lys Tyr Ala Pro Gln
130 135 140

cct atc cac gag agt gct tta ttg act ggc cct ttg gag ccc acc aac 480
Pro Ile His Glu Ser Ala Leu Leu Thr Gly Pro Leu Glu Pro Thr Asn
145 150 155 160

gac gcg tat gcg atc gcc aag atc gcc ggt atc ctg caa gtt cag gcg 528
Asp Ala Tyr Ala Ile Ala Lys Ile Ala Gly Ile Leu Gln Val Gln Ala
165 170 175

gtt agg cgc caa tat ggg ctg gcg tgg atc tct gcg atg ccg act aac 576
Val Arg Arg Gln Tyr Gly Leu Ala Trp Ile Ser Ala Met Pro Thr Asn
180 185 190

ctc tac gga ccc ggc gac aac ttc tcc ccg tcc ggg tcg cat ctc ttg 624
Leu Tyr Gly Pro Gly Asp Asn Phe Ser Pro Ser Gly Ser His Leu Leu
195 200 205

ccg gcg ctc atc cgt cga tat gag gaa gcc aaa gct ggt ggt gca gaa 672
Pro Ala Leu Ile Arg Arg Tyr Glu Glu Ala Lys Ala Gly Gly Ala Glu
210 215 220

gag gtg acg aat tgg ggg acc ggt act ccg cgg cgc gaa ctt ctg cat 720
Glu Val Thr Asn Trp Gly Thr Gly Thr Pro Arg Arg Glu Leu Leu His
225 230 235 240

gtc gac gat ctg gcg agc gca tgc ctg ttc ctt ttg gaa cat ttc gat 768
Val Asp Asp Leu Ala Ser Ala Cys Leu Phe Leu Leu Glu His Phe Asp
245 250 255

ggt ccg aac cac gtc aac gtg ggc acc ggc gtc gat cac agc att agc 816
Gly Pro Asn His Val Asn Val Gly Thr Gly Val Asp His Ser Ile Ser
260 265 270

gag atc gca gac atg gtc gct aca gcg gtg ggc tac atc ggc gaa aca 864
Glu Ile Ala Asp Met Val Ala Thr Ala Val Gly Tyr Ile Gly Glu Thr
275 280 285

cgt tgg gat cca act aaa ccc gat gga acc ccg cgc aaa cta ttg gac 912
Arg Trp Asp Pro Thr Lys Pro Asp Gly Thr Pro Arg Lys Leu Leu Asp
290 295 300

gtc tcc gcg cta cgc gag ttg ggt tgg cgc ccg cga atc gca ctg aaa 960
Val Ser Ala Leu Arg Glu Leu Gly Trp Arg Pro Arg Ile Ala Leu Lys
305 310 315 320

gac ggc atc gat gca acg gtg tcg tgg tac cgc aca aat gcc gat gcc 1008
Asp Gly Ile Asp Ala Thr Val Ser Trp Tyr Arg Thr Asn Ala Asp Ala
325 330 335

gtg agg agg taa 1020
Val Arg Arg



<210> 14 
<211> 339 
<212> PRT 

<213> Mycobacterium 



<400> 14 

Val Arg Trp His Thr Met Asp Arg His Ala Asp Val Ala Trp Leu Gly
1 5 10 15

Gln Ser Lys Leu Thr Thr Thr Pro Gly Pro Leu Asp Arg Ala Thr Pro
20 25 30

Val Tyr Ile Ala Gly His Arg Gly Leu Val Gly Ser Ala Leu Val Arg
35 40 45

Arg Phe Glu Ala Glu Gly Phe Thr Asn Leu Ile Val Arg Ser Arg Asp
50 55 60

Glu Ile Asp Leu Thr Asp Arg Ala Ala Thr Phe Asp Phe Val Ser Glu
65 70 75 80

Thr Arg Pro Gln Val Ile Ile Asp Ala Ala Ala Arg Val Gly Gly Ile
85 90 95

Met Ala Asn Asn Thr Tyr Pro Ala Asp Phe Leu Ser Glu Asn Leu Arg
100 105 110

Ile Gln Thr Asn Leu Leu Asp Ala Ala Val Ala Val Arg Val Pro Arg
115 120 125

Leu Leu Phe Leu Gly Ser Ser Cys Ile Tyr Pro Lys Tyr Ala Pro Gln
130 135 140

Pro Ile His Glu Ser Ala Leu Leu Thr Gly Pro Leu Glu Pro Thr Asn
145 150 155 160

Asp Ala Tyr Ala Ile Ala Lys Ile Ala Gly Ile Leu Gln Val Gln Ala
165 170 175

Val Arg Arg Gln Tyr Gly Leu Ala Trp Ile Ser Ala Met Pro Thr Asn
180 185 190

Leu Tyr Gly Pro Gly Asp Asn Phe Ser Pro Ser Gly Ser His Leu Leu
195 200 205

Pro Ala Leu Ile Arg Arg Tyr Glu Glu Ala Lys Ala Gly Gly Ala Glu
210 215 220

Glu Val Thr Asn Trp Gly Thr Gly Thr Pro Arg Arg Glu Leu Leu His
225 230 235 240

Val Asp Asp Leu Ala Ser Ala Cys Leu Phe Leu Leu Glu His Phe Asp
245 250 255

Gly Pro Asn His Val Asn Val Gly Thr Gly Val Asp His Ser Ile Ser
260 265 270

Glu Ile Ala Asp Met Val Ala Thr Ala Val Gly Tyr Ile Gly Glu Thr
275 280 285

Arg Trp Asp Pro Thr Lys Pro Asp Gly Thr Pro Arg Lys Leu Leu Asp
290 295 300

Val Ser Ala Leu Arg Glu Leu Gly Trp Arg Pro Arg Ile Ala Leu Lys
305 310 315 320

Asp Gly Ile Asp Ala Thr Val Ser Trp Tyr Arg Thr Asn Ala Asp Ala
325 330 335

Val Arg Arg



<210> 15 
<211> 1020 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(1017)

<400> 15 

gtg cga tgg cac acc atg gat cga cac gcc gat gtt gcc tgg ttg ggg 48
Val Arg Trp His Thr Met Asp Arg His Ala Asp Val Ala Trp Leu Gly
1 5 10 15

cgg agt aag ttg acg act aca cct ggg cct ctg gac cgc gca acg ccc 96
Arg Ser Lys Leu Thr Thr Thr Pro Gly Pro Leu Asp Arg Ala Thr Pro
20 25 30

gtg tat atc gcc ggt cat cgg ggg ctg gtc ggc tca gcg ctc gta cgt 144
Val Tyr Ile Ala Gly His Arg Gly Leu Val Gly Ser Ala Leu Val Arg
35 40 45

aga ttt gag gcc gag ggg ttc acc aat ctc att gtg cga tca cgc gat 192
Arg Phe Glu Ala Glu Gly Phe Thr Asn Leu Ile Val Arg Ser Arg Asp
50 55 60

gag att gat ctg acg gac cga gcc gca acg ttt gat ttt gtg tct gag 240
Glu Ile Asp Leu Thr Asp Arg Ala Ala Thr Phe Asp Phe Val Ser Glu
65 70 75 80

aca aga cca cag gtg atc atc gat gcg gcc gca cgg gtc ggc ggc atc 288
Thr Arg Pro Gln Val Ile Ile Asp Ala Ala Ala Arg Val Gly Gly Ile
85 90 95

atg gcg aat aac acc tat ccc gcg gac ttc ttg tcc gaa aac ctc cga 336
Met Ala Asn Asn Thr Tyr Pro Ala Asp Phe Leu Ser Glu Asn Leu Arg
100 105 110

atc cag acc aat ttg ctc gac gca gct gtc gcc gtg cgt gtg ccg cgg 384
Ile Gln Thr Asn Leu Leu Asp Ala Ala Val Ala Val Arg Val Pro Arg
115 120 125

ctc ctt ttc ctc ggt tcg tca tgc atc tac ccg aag tac gct ccg caa 432
Leu Leu Phe Leu Gly Ser Ser Cys Ile Tyr Pro Lys Tyr Ala Pro Gln
130 135 140

cct atc cac gag agt gct tta ttg act ggc cct ttg gag ccc acc aac 480
Pro Ile His Glu Ser Ala Leu Leu Thr Gly Pro Leu Glu Pro Thr Asn
145 150 155 160

gac gcg tat gcg atc gcc aag atc gcc ggt atc ctg caa gtt cag gcg 528
Asp Ala Tyr Ala Ile Ala Lys Ile Ala Gly Ile Leu Gln Val Gln Ala
165 170 175

gtt agg cgc caa tat ggg ctg gcg tgg atc tct gcg atg ccg act aac 576
Val Arg Arg Gln Tyr Gly Leu Ala Trp Ile Ser Ala Met Pro Thr Asn
180 185 190

ctc tac gga ccc ggc gac aac ttc tcc ccg tcc ggg tcg cat ctc ttg 624
Leu Tyr Gly Pro Gly Asp Asn Phe Ser Pro Ser Gly Ser His Leu Leu
195 200 205

ccg gcg ctc atc cgt cga tat gag gaa gcc aaa gct ggt ggt gca gaa 672
Pro Ala Leu Ile Arg Arg Tyr Glu Glu Ala Lys Ala Gly Gly Ala Glu
210 215 220

gag gtg acg aat tgg ggg acc ggt act ccg cgg cgc gaa ctt ctg cat 720
Glu Val Thr Asn Trp Gly Thr Gly Thr Pro Arg Arg Glu Leu Leu His
225 230 235 240

gtc gac gat ctg gcg agc gca tgc ctg ttc ctt ttg gaa cat ttc gat 768
Val Asp Asp Leu Ala Ser Ala Cys Leu Phe Leu Leu Glu His Phe Asp
245 250 255

ggt ccg aac cac gtc aac gtg ggc acc ggc gtc gat cac agc att agc 816
Gly Pro Asn His Val Asn Val Gly Thr Gly Val Asp His Ser Ile Ser
260 265 270

gag atc gca gac atg gtc gct acg gcg gtg ggc tac atc ggc gaa aca 864
Glu Ile Ala Asp Met Val Ala Thr Ala Val Gly Tyr Ile Gly Glu Thr
275 280 285

cgt tgg gat cca act aaa ccc gat gga acc ccg cgc aaa cta ttg gac 912
Arg Trp Asp Pro Thr Lys Pro Asp Gly Thr Pro Arg Lys Leu Leu Asp
290 295 300

gtc tcc gcg cta cgc gag ttg ggt tgg cgc ccg cga atc gca ctg aaa 960
Val Ser Ala Leu Arg Glu Leu Gly Trp Arg Pro Arg Ile Ala Leu Lys
305 310 315 320

gac ggc atc gat gca acg gtg tcg tgg tac cgc aca aat gcc gat gcc 1008
Asp Gly Ile Asp Ala Thr Val Ser Trp Tyr Arg Thr Asn Ala Asp Ala
325 330 335

gtg agg agg taa 1020
Val Arg Arg



<210> 16 
<211> 339 
<212> PRT 

<213> Mycobacterium 
<400> 16

Val Arg Trp His Thr Met Asp Arg His Ala Asp Val Ala Trp Leu Gly
1 5 10 15

Arg Ser Lys Leu Thr Thr Thr Pro Gly Pro Leu Asp Arg Ala Thr Pro
20 25 30

Val Tyr Ile Ala Gly His Arg Gly Leu Val Gly Ser Ala Leu Val Arg
35 40 45

Arg Phe Glu Ala Glu Gly Phe Thr Asn Leu Ile Val Arg Ser Arg Asp
50 55 60

Glu Ile Asp Leu Thr Asp Arg Ala Ala Thr Phe Asp Phe Val Ser Glu
65 70 75 80

Thr Arg Pro Gln Val Ile Ile Asp Ala Ala Ala Arg Val Gly Gly Ile
85 90 95

Met Ala Asn Asn Thr Tyr Pro Ala Asp Phe Leu Ser Glu Asn Leu Arg
100 105 110

Ile Gln Thr Asn Leu Leu Asp Ala Ala Val Ala Val Arg Val Pro Arg
115 120 125

Leu Leu Phe Leu Gly Ser Ser Cys Ile Tyr Pro Lys Tyr Ala Pro Gln
130 135 140

Pro Ile His Glu Ser Ala Leu Leu Thr Gly Pro Leu Glu Pro Thr Asn
145 150 155 160

Asp Ala Tyr Ala Ile Ala Lys Ile Ala Gly Ile Leu Gln Val Gln Ala
165 170 175

Val Arg Arg Gln Tyr Gly Leu Ala Trp Ile Ser Ala Met Pro Thr Asn
180 185 190

Leu Tyr Gly Pro Gly Asp Asn Phe Ser Pro Ser Gly Ser His Leu Leu
195 200 205

Pro Ala Leu Ile Arg Arg Tyr Glu Glu Ala Lys Ala Gly Gly Ala Glu
210 215 220

Glu Val Thr Asn Trp Gly Thr Gly Thr Pro Arg Arg Glu Leu Leu His
225 230 235 240

Val Asp Asp Leu Ala Ser Ala Cys Leu Phe Leu Leu Glu His Phe Asp
245 250 255

Gly Pro Asn His Val Asn Val Gly Thr Gly Val Asp His Ser Ile Ser
260 265 270

Glu Ile Ala Asp Met Val Ala Thr Ala Val Gly Tyr Ile Gly Glu Thr
275 280 285

Arg Trp Asp Pro Thr Lys Pro Asp Gly Thr Pro Arg Lys Leu Leu Asp
290 295 300

Val Ser Ala Leu Arg Glu Leu Gly Trp Arg Pro Arg Ile Ala Leu Lys
305 310 315 320

Asp Gly Ile Asp Ala Thr Val Ser Trp Tyr Arg Thr Asn Ala Asp Ala
325 330 335

Val Arg Arg



<210> 17 
<211> 723 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(720)

<400> 17 

atg gat ttt ttg cgc aac gcc ggc ttg atg gct cgt aac gtt agt acc 48
Met Asp Phe Leu Arg Asn Ala Gly Leu Met Ala Arg Asn Val Ser Thr
1 5 10 15

gag atg ctg cgc cac ttc gaa cga aag cgc cta tta gta aac caa ttc 96
Glu Met Leu Arg His Phe Glu Arg Lys Arg Leu Leu Val Asn Gln Phe
20 25 30

aaa gca tac gga gtc aac gtt gtt att gat gtc ggt gct aac tcc ggc 144
Lys Ala Tyr Gly Val Asn Val Val Ile Asp Val Gly Ala Asn Ser Gly
35 40 45

cag ttc ggt agc gct ttg cgt cgt gca gga ttc aag agc cgt atc gtt 192
Gln Phe Gly Ser Ala Leu Arg Arg Ala Gly Phe Lys Ser Arg Ile Val
50 55 60

tcc ttt gaa cct ctt tcg ggg cca ttt gcg caa cta acg cgc aag tcg 240
Ser Phe Glu Pro Leu Ser Gly Pro Phe Ala Gln Leu Thr Arg Lys Ser
65 70 75 80

gca tcg gat cca cta tgg gag tgt cac cag tat gcc cta ggc gac gcc 288
Ala Ser Asp Pro Leu Trp Glu Cys His Gln Tyr Ala Leu Gly Asp Ala
85 90 95

gat gag acg att acc atc aat gtg gca ggc aat gcg ggg gca agt agt 336
Asp Glu Thr Ile Thr Ile Asn Val Ala Gly Asn Ala Gly Ala Ser Ser
100 105 110

tcc gtg ctg ccg atg ctt aaa agt cat caa gat gcc ttt cct ccc gcg 384
Ser Val Leu Pro Met Leu Lys Ser His Gln Asp Ala Phe Pro Pro Ala
115 120 125

aat tat att ggc acc gaa gac gtt gca ata cac cgc ctt gat tcg gtt 432
Asn Tyr Ile Gly Thr Glu Asp Val Ala Ile His Arg Leu Asp Ser Val
130 135 140

gca tca gaa ttt ctg aac cct acc gat gtt act ttc ctg aag atc gac 480
Ala Ser Glu Phe Leu Asn Pro Thr Asp Val Thr Phe Leu Lys Ile Asp
145 150 155 160

gta cag ggt ttc gag aag cag gtt atc acg ggc agt aag tca acg ctt 528
Val Gln Gly Phe Glu Lys Gln Val Ile Thr Gly Ser Lys Ser Thr Leu
165 170 175

aac gaa agc tgc gtc ggc atg caa ctc gaa ctt tct ttt att ccg ttg 576
Asn Glu Ser Cys Val Gly Met Gln Leu Glu Leu Ser Phe Ile Pro Leu
180 185 190

tac gaa ggt gac atg ctg att cat gaa gcg ctt gaa ctt gtc tat tcc 624
Tyr Glu Gly Asp Met Leu Ile His Glu Ala Leu Glu Leu Val Tyr Ser
195 200 205

cta ggt ttc aga ctg acg ggt ttg ttg ccc ggc ttt acg gat ccg cgc 672
Leu Gly Phe Arg Leu Thr Gly Leu Leu Pro Gly Phe Thr Asp Pro Arg
210 215 220

aat ggt cga atg ctt caa gct gac ggc att ttc ttc cgt ggg gac gat 720
Asn Gly Arg Met Leu Gln Ala Asp Gly Ile Phe Phe Arg Gly Asp Asp
225 230 235 240

tga 723



<210> 18 
<211> 240 
<212> PRT 

<213> Mycobacterium 
<400> 18 

Met Asp Phe Leu Arg Asn Ala Gly Leu Met Ala Arg Asn Val Ser Thr
1 5 10 15

Glu Met Leu Arg His Phe Glu Arg Lys Arg Leu Leu Val Asn Gln Phe
20 25 30

Lys Ala Tyr Gly Val Asn Val Val Ile Asp Val Gly Ala Asn Ser Gly
35 40 45

Gln Phe Gly Ser Ala Leu Arg Arg Ala Gly Phe Lys Ser Arg Ile Val
50 55 60

Ser Phe Glu Pro Leu Ser Gly Pro Phe Ala Gln Leu Thr Arg Lys Ser
65 70 75 80

Ala Ser Asp Pro Leu Trp Glu Cys His Gln Tyr Ala Leu Gly Asp Ala
85 90 95

Asp Glu Thr Ile Thr Ile Asn Val Ala Gly Asn Ala Gly Ala Ser Ser
100 105 110

Ser Val Leu Pro Met Leu Lys Ser His Gln Asp Ala Phe Pro Pro Ala
115 120 125

Asn Tyr Ile Gly Thr Glu Asp Val Ala Ile His Arg Leu Asp Ser Val
130 135 140

Ala Ser Glu Phe Leu Asn Pro Thr Asp Val Thr Phe Leu Lys Ile Asp
145 150 155 160

Val Gln Gly Phe Glu Lys Gln Val Ile Thr Gly Ser Lys Ser Thr Leu
165 170 175

Asn Glu Ser Cys Val Gly Met Gln Leu Glu Leu Ser Phe Ile Pro Leu
180 185 190

Tyr Glu Gly Asp Met Leu Ile His Glu Ala Leu Glu Leu Val Tyr Ser
195 200 205

Leu Gly Phe Arg Leu Thr Gly Leu Leu Pro Gly Phe Thr Asp Pro Arg
210 215 220

Asn Gly Arg Met Leu Gln Ala Asp Gly Ile Phe Phe Arg Gly Asp Asp
225 230 235 240



<210> 19 
<211> 723 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(720)

<400> 19 

atg gat ttt ttg cgc aac gcc ggc ttg atg gct cgt aac gtt agc acc 48
Met Asp Phe Leu Arg Asn Ala Gly Leu Met Ala Arg Asn Val Ser Thr
1 5 10 15

gag atg ctg cgc cac ttc gaa cga aag cgc cta tta gta aac caa ttc 96
Glu Met Leu Arg His Phe Glu Arg Lys Arg Leu Leu Val Asn Gln Phe
20 25 30

aaa gca tac gga gtc aac gtt gtt att gat gtc ggt gct aac tcc ggc 144
Lys Ala Tyr Gly Val Asn Val Val Ile Asp Val Gly Ala Asn Ser Gly
35 40 45

cag ttc ggt agc gct ttg cgt cgt gca gga ttc aag agc cgt atc gtt 192
Gln Phe Gly Ser Ala Leu Arg Arg Ala Gly Phe Lys Ser Arg Ile Val
50 55 60

tcc ttt gaa cct ctt tcg ggg cca ttt gcg caa cta acg cgc gag tcg 240
Ser Phe Glu Pro Leu Ser Gly Pro Phe Ala Gln Leu Thr Arg Glu Ser
65 70 75 80

gca tcg gat cca cta tgg gag tgt cac cag tat gcc cta ggc gac gcc 288
Ala Ser Asp Pro Leu Trp Glu Cys His Gln Tyr Ala Leu Gly Asp Ala
85 90 95

gat gag acg att acc atc aat gtg gca ggc aat gcg ggg gca agt agt 336
Asp Glu Thr Ile Thr Ile Asn Val Ala Gly Asn Ala Gly Ala Ser Ser
100 105 110

tcc gtg ctg ccg atg ctt aaa agt cat caa gat gcc ttt cct ccc gcg 384
Ser Val Leu Pro Met Leu Lys Ser His Gln Asp Ala Phe Pro Pro Ala
115 120 125

aat tat att ggc acc gaa gac gtt gca ata cac cgc ctt gat tcg gtt 432
Asn Tyr Ile Gly Thr Glu Asp Val Ala Ile His Arg Leu Asp Ser Val
130 135 140

gca tca gaa ttt ctg aac cct acc gat gtt act ttc ctg aag atc gac 480
Ala Ser Glu Phe Leu Asn Pro Thr Asp Val Thr Phe Leu Lys Ile Asp
145 150 155 160

gta cag ggt ttc gag aag cag gtt atc gcg ggc agt aag tca acg ctt 528
Val Gln Gly Phe Glu Lys Gln Val Ile Ala Gly Ser Lys Ser Thr Leu
165 170 175

aac gaa agc tgc gtc ggc atg caa ctc gaa ctt tct ttt att ccg ttg 576
Asn Glu Ser Cys Val Gly Met Gln Leu Glu Leu Ser Phe Ile Pro Leu
180 185 190

tac gaa ggt gac atg ctg att cat gaa gcg ctt gaa ctt gtc tat tcc 624
Tyr Glu Gly Asp Met Leu Ile His Glu Ala Leu Glu Leu Val Tyr Ser
195 200 205

cta ggt ttc aga ctg acg ggt ttg ttg ccc gga ttt acg gat ccg cgc 672
Leu Gly Phe Arg Leu Thr Gly Leu Leu Pro Gly Phe Thr Asp Pro Arg
210 215 220

aat ggt cga atg ctt caa gct gac ggc att ttc ttc cgt ggg gac gat 720
Asn Gly Arg Met Leu Gln Ala Asp Gly Ile Phe Phe Arg Gly Asp Asp
225 230 235 240

tga 723



<210> 20 
<211> 240 
<212> PRT 

<213> Mycobacterium 
<400> 20 

Met Asp Phe Leu Arg Asn Ala Gly Leu Met Ala Arg Asn Val Ser Thr
1 5 10 15

Glu Met Leu Arg His Phe Glu Arg Lys Arg Leu Leu Val Asn Gln Phe
20 25 30

Lys Ala Tyr Gly Val Asn Val Val Ile Asp Val Gly Ala Asn Ser Gly
35 40 45

Gln Phe Gly Ser Ala Leu Arg Arg Ala Gly Phe Lys Ser Arg Ile Val
50 55 60

Ser Phe Glu Pro Leu Ser Gly Pro Phe Ala Gln Leu Thr Arg Glu Ser
65 70 75 80

Ala Ser Asp Pro Leu Trp Glu Cys His Gln Tyr Ala Leu Gly Asp Ala
85 90 95

Asp Glu Thr Ile Thr Ile Asn Val Ala Gly Asn Ala Gly Ala Ser Ser
100 105 110

Ser Val Leu Pro Met Leu Lys Ser His Gln Asp Ala Phe Pro Pro Ala
115 120 125

Asn Tyr Ile Gly Thr Glu Asp Val Ala Ile His Arg Leu Asp Ser Val
130 135 140

Ala Ser Glu Phe Leu Asn Pro Thr Asp Val Thr Phe Leu Lys Ile Asp
145 150 155 160

Val Gln Gly Phe Glu Lys Gln Val Ile Ala Gly Ser Lys Ser Thr Leu
165 170 175

Asn Glu Ser Cys Val Gly Met Gln Leu Glu Leu Ser Phe Ile Pro Leu
180 185 190

Tyr Glu Gly Asp Met Leu Ile His Glu Ala Leu Glu Leu Val Tyr Ser
195 200 205

Leu Gly Phe Arg Leu Thr Gly Leu Leu Pro Gly Phe Thr Asp Pro Arg
210 215 220

Asn Gly Arg Met Leu Gln Ala Asp Gly Ile Phe Phe Arg Gly Asp Asp
225 230 235 240



<210> 21 
<211> 801 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(798)

<400> 21 

atg act gcg cca gtg ttc tcg ata att atc cct acc ttc aat gca gcg 48
Met Thr Ala Pro Val Phe Ser Ile Ile Ile Pro Thr Phe Asn Ala Ala
1 5 10 15

gtg acg ctg caa gcc tgc ctc gga agc atc gtc ggg cag acc tac cgg 96
Val Thr Leu Gln Ala Cys Leu Gly Ser Ile Val Gly Gln Thr Tyr Arg
20 25 30

gaa gtg gaa gtg gtc ctt gtc gac ggc ggt tcg acc gat cgg acc ctc 144
Glu Val Glu Val Val Leu Val Asp Gly Gly Ser Thr Asp Arg Thr Leu
35 40 45

gac atc gcg aac agt ttc cgc ccg gaa ctc ggc tcg cga ctg gtc gtt 192
Asp Ile Ala Asn Ser Phe Arg Pro Glu Leu Gly Ser Arg Leu Val Val
50 55 60

cac agc ggg ccc gat gat ggc ccc tac gac gcc atg aac cgc ggc gtc 240
His Ser Gly Pro Asp Asp Gly Pro Tyr Asp Ala Met Asn Arg Gly Val
65 70 75 80

ggc gtg gcc aca ggc gaa tgg gta ctt ttt tta ggc gcc gac gac acc 288
Gly Val Ala Thr Gly Glu Trp Val Leu Phe Leu Gly Ala Asp Asp Thr
85 90 95

ctc tac gaa cca acc acg ttg gcc cag gta gcc gct ttt ctc ggc gac 336
Leu Tyr Glu Pro Thr Thr Leu Ala Gln Val Ala Ala Phe Leu Gly Asp
100 105 110

cat gcg gca agc cat ctt gtc tat ggc gat gtt gtg atg cgt tcg acg 384
His Ala Ala Ser His Leu Val Tyr Gly Asp Val Val Met Arg Ser Thr
115 120 125

aaa agc cgg cat gcc gga cct ttc gac ctc gac cgc ctc cta ttt gag 432
Lys Ser Arg His Ala Gly Pro Phe Asp Leu Asp Arg Leu Leu Phe Glu
130 135 140

acg aat ttg tgc cac caa tcg atc ttt tac cgc cgt gag ctt ttc gac 480
Thr Asn Leu Cys His Gln Ser Ile Phe Tyr Arg Arg Glu Leu Phe Asp
145 150 155 160

ggc atc ggc cct tac aac ctg cgc tac cga gtc tgg gcg gac tgg gac 528
Gly Ile Gly Pro Tyr Asn Leu Arg Tyr Arg Val Trp Ala Asp Trp Asp
165 170 175

ttc aat att cgc tgc ttc tcc aac ccg gcg ctg att acc cgc tac atg 576
Phe Asn Ile Arg Cys Phe Ser Asn Pro Ala Leu Ile Thr Arg Tyr Met
180 185 190

gac gtc gtg att tcc gaa tac aac gac atg acc ggc ttc agc atg agg 624
Asp Val Val Ile Ser Glu Tyr Asn Asp Met Thr Gly Phe Ser Met Arg
195 200 205

cag ggg act gat aaa gag ttc aga aaa cgg ctg cca atg tac ttc tgg 672
Gln Gly Thr Asp Lys Glu Phe Arg Lys Arg Leu Pro Met Tyr Phe Trp
210 215 220

gtt gca ggg tgg gag act tgc agg cgc atg ctg gcg ttt ttg aaa gac 720
Val Ala Gly Trp Glu Thr Cys Arg Arg Met Leu Ala Phe Leu Lys Asp
225 230 235 240

aag gag aat cgc cgt ctg gcc ttg cgt acg cgg ttg ata agg gtt aag 768
Lys Glu Asn Arg Arg Leu Ala Leu Arg Thr Arg Leu Ile Arg Val Lys
245 250 255

gcc gtc tcc aaa gaa cga agc gca gaa ccg tag 801
Ala Val Ser Lys Glu Arg Ser Ala Glu Pro
260 265



<210> 22 
<211> 266
<212> PRT 

<213> Mycobacterium 



<400> 22 

Met Thr Ala Pro Val Phe Ser Ile Ile Ile Pro Thr Phe Asn Ala Ala
1 5 10 15

Val Thr Leu Gln Ala Cys Leu Gly Ser Ile Val Gly Gln Thr Tyr Arg
20 25 30

Glu Val Glu Val Val Leu Val Asp Gly Gly Ser Thr Asp Arg Thr Leu
35 40 45

Asp Ile Ala Asn Ser Phe Arg Pro Glu Leu Gly Ser Arg Leu Val Val
50 55 60

His Ser Gly Pro Asp Asp Gly Pro Tyr Asp Ala Met Asn Arg Gly Val
65 70 75 80

Gly Val Ala Thr Gly Glu Trp Val Leu Phe Leu Gly Ala Asp Asp Thr
85 90 95

Leu Tyr Glu Pro Thr Thr Leu Ala Gln Val Ala Ala Phe Leu Gly Asp
100 105 110

His Ala Ala Ser His Leu Val Tyr Gly Asp Val Val Met Arg Ser Thr
115 120 125

Lys Ser Arg His Ala Gly Pro Phe Asp Leu Asp Arg Leu Leu Phe Glu
130 135 140

Thr Asn Leu Cys His Gln Ser Ile Phe Tyr Arg Arg Glu Leu Phe Asp
145 150 155 160

Gly Ile Gly Pro Tyr Asn Leu Arg Tyr Arg Val Trp Ala Asp Trp Asp
165 170 175

Phe Asn Ile Arg Cys Phe Ser Asn Pro Ala Leu Ile Thr Arg Tyr Met
180 185 190

Asp Val Val Ile Ser Glu Tyr Asn Asp Met Thr Gly Phe Ser Met Arg
195 200 205

Gln Gly Thr Asp Lys Glu Phe Arg Lys Arg Leu Pro Met Tyr Phe Trp
210 215 220

Val Ala Gly Trp Glu Thr Cys Arg Arg Met Leu Ala Phe Leu Lys Asp
225 230 235 240

Lys Glu Asn Arg Arg Leu Ala Leu Arg Thr Arg Leu Ile Arg Val Lys
245 250 255

Ala Val Ser Lys Glu Arg Ser Ala Glu Pro
260 265



<210> 23 
<211> 801 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(798)




<400> 23 

atg act gcg cca gtg ttc tcg ata att atc cct acc ttc aat gca gcg 48
Met Thr Ala Pro Val Phe Ser Ile Ile Ile Pro Thr Phe Asn Ala Ala
1 5 10 15

gtg acg ctg caa gcc tgc ctc gga agc atc gtc ggg cag acc tac cgg 96
Val Thr Leu Gln Ala Cys Leu Gly Ser Ile Val Gly Gln Thr Tyr Arg
20 25 30

gaa gtg gaa gtg gtc ctt gtc gac ggc ggt tcg acc gat cgg acc ctc 144
Glu Val Glu Val Val Leu Val Asp Gly Gly Ser Thr Asp Arg Thr Leu
35 40 45

gac atc gcg aac agt ttc cgc ccg gaa ctc ggc tcg cga ctg gtc gtt 192
Asp Ile Ala Asn Ser Phe Arg Pro Glu Leu Gly Ser Arg Leu Val Val
50 55 60

cac agc ggg ccc gat gat ggc ccc tac gac gcc atg aac cgc ggc gtc 240
His Ser Gly Pro Asp Asp Gly Pro Tyr Asp Ala Met Asn Arg Gly Val
65 70 75 80

ggc gta gcc aca ggc gaa tgg gta ctt ttt tta ggc gcc gac gac acc 288
Gly Val Ala Thr Gly Glu Trp Val Leu Phe Leu Gly Ala Asp Asp Thr
85 90 95

ctc tac gaa cca acc acg ttg gcc cag gta gcc gct ttt ctc ggc gac 336
Leu Tyr Glu Pro Thr Thr Leu Ala Gln Val Ala Ala Phe Leu Gly Asp
100 105 110

cat gcg gca agc cat ctt gtc tat ggc gat gtt gtg atg cgt tcg acg 384
His Ala Ala Ser His Leu Val Tyr Gly Asp Val Val Met Arg Ser Thr
115 120 125

aaa agc cgg cat gcc gga cct ttc gac ctc gac cgc ctc cta ttt gag 432
Lys Ser Arg His Ala Gly Pro Phe Asp Leu Asp Arg Leu Leu Phe Glu
130 135 140

acg aat ttg tgc cac caa tcg atc ttt tac cgc cgt gag ctt ttc gac 480
Thr Asn Leu Cys His Gln Ser Ile Phe Tyr Arg Arg Glu Leu Phe Asp
145 150 155 160

ggc atc ggc cct tac aac ctg cgc tac cga gtc tgg gcg gac tgg gac 528
Gly Ile Gly Pro Tyr Asn Leu Arg Tyr Arg Val Trp Ala Asp Trp Asp
165 170 175

ttc aat att cgc tgc ttc tcc aac ccg gcg ctg att acc cgc tac atg 576
Phe Asn Ile Arg Cys Phe Ser Asn Pro Ala Leu Ile Thr Arg Tyr Met
180 185 190

gac gtc gtg att tcc gaa tac aac gac atg acc ggc ttc agc atg agg 624
Asp Val Val Ile Ser Glu Tyr Asn Asp Met Thr Gly Phe Ser Met Arg
195 200 205

cag ggg act gat aaa gag ttc aga aaa cgg ctg cca atg tac ttc tgg 672
Gln Gly Thr Asp Lys Glu Phe Arg Lys Arg Leu Pro Met Tyr Phe Trp
210 215 220

gtt gca ggg tgg gag act tgc agg cgc atg ctg gcg ttt ttg aaa gac 720
Val Ala Gly Trp Glu Thr Cys Arg Arg Met Leu Ala Phe Leu Lys Asp
225 230 235 240

aag gag aat cgc cgt ctg gcc ttg cgt acg cgg ttg ata agg gtt aag 768
Lys Glu Asn Arg Arg Leu Ala Leu Arg Thr Arg Leu Ile Arg Val Lys
245 250 255

gcc gtc tcc aaa gaa cga agc gca gaa ccg tag 801
Ala Val Ser Lys Glu Arg Ser Ala Glu Pro
260 265



<210> 24 
<211> 266 
<212> PRT 

<213> Mycobacterium 
<400> 24 

Met Thr Ala Pro Val Phe Ser Ile Ile Ile Pro Thr Phe Asn Ala Ala
1 5 10 15

Val Thr Leu Gln Ala Cys Leu Gly Ser Ile Val Gly Gln Thr Tyr Arg
20 25 30

Glu Val Glu Val Val Leu Val Asp Gly Gly Ser Thr Asp Arg Thr Leu
35 40 45

Asp Ile Ala Asn Ser Phe Arg Pro Glu Leu Gly Ser Arg Leu Val Val
50 55 60

His Ser Gly Pro Asp Asp Gly Pro Tyr Asp Ala Met Asn Arg Gly Val
65 70 75 80

Gly Val Ala Thr Gly Glu Trp Val Leu Phe Leu Gly Ala Asp Asp Thr
85 90 95

Leu Tyr Glu Pro Thr Thr Leu Ala Gln Val Ala Ala Phe Leu Gly Asp
100 105 110

His Ala Ala Ser His Leu Val Tyr Gly Asp Val Val Met Arg Ser Thr
115 120 125

Lys Ser Arg His Ala Gly Pro Phe Asp Leu Asp Arg Leu Leu Phe Glu
130 135 140

Thr Asn Leu Cys His Gln Ser Ile Phe Tyr Arg Arg Glu Leu Phe Asp
145 150 155 160

Gly Ile Gly Pro Tyr Asn Leu Arg Tyr Arg Val Trp Ala Asp Trp Asp
165 170 175

Phe Asn Ile Arg Cys Phe Ser Asn Pro Ala Leu Ile Thr Arg Tyr Met
180 185 190

Asp Val Val Ile Ser Glu Tyr Asn Asp Met Thr Gly Phe Ser Met Arg
195 200 205

Gln Gly Thr Asp Lys Glu Phe Arg Lys Arg Leu Pro Met Tyr Phe Trp
210 215 220

Val Ala Gly Trp Glu Thr Cys Arg Arg Met Leu Ala Phe Leu Lys Asp
225 230 235 240

Lys Glu Asn Arg Arg Leu Ala Leu Arg Thr Arg Leu Ile Arg Val Lys
245 250 255

Ala Val Ser Lys Glu Arg Ser Ala Glu Pro 
260 265 



<210> 25 
<211> 867 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(864)

<400> 25 

gtg gcc agc aga agt ccc cac tcc gct gcg ggt ggt tgg cta att ctt 48
Val Ala Ser Arg Ser Pro His Ser Ala Ala Gly Gly Trp Leu Ile Leu
1 5 10 15

ggc ggc tcc ctt ctt gtg gtc ggc gtg gcg cat ccg gta gga ctc gcc 96
Gly Gly Ser Leu Leu Val Val Gly Val Ala His Pro Val Gly Leu Ala
20 25 30

gga ggt gac gac gat gct ggc gtg gtg cag cag ccg atc gag gat gct 144
Gly Gly Asp Asp Asp Ala Gly Val Val Gln Gln Pro Ile Glu Asp Ala
35 40 45

ggc ggc ggt ggt gtg ctc ggg cag gaa tcg ccc cca ttg ttc gaa ggg 192
Gly Gly Gly Gly Val Leu Gly Gln Glu Ser Pro Pro Leu Phe Glu Gly
50 55 60

cca atg cga ggc gat ggc cag gga gcg gcg ctc gta gcc ggc agc cac 240
Pro Met Arg Gly Asp Gly Gln Gly Ala Ala Leu Val Ala Gly Ser His
65 70 75 80

gag ccg gaa caa cag ttg agt ccc ggt gtc gtc gag cgg ggc aaa gcc 288
Glu Pro Glu Gln Gln Leu Ser Pro Gly Val Val Glu Arg Gly Glu Ala
85 90 95

gat ctc gtc caa gat gac cag atc cgc gcg gag cag ggt gtc gat gat 336
Asp Leu Val Gln Asp Asp Gln Ile Arg Ala Glu Gln Gly Val Asp Asp
100 105 110

ctt gcc gac ggt gtt gtc ggc cag gcc gcg gta gag gac ctc gat cag 384
Leu Ala Asp Gly Val Val Gly Gln Ala Ala Val Glu Asp Leu Asp Gln
115 120 125

gtc ggc ggc ggt gaa gta gcg gac ttt gaa tcc ggc gtg gac ggc agc 432
Val Gly Gly Gly Glu Val Ala Asp Phe Glu Ser Gly Val Asp Gly Ser
130 135 140

gtg ccc gca gcc gat gag cag gtg act ttt gcc cgt acc agg tgg gcc 480
Val Pro Ala Ala Asp Glu Gln Val Thr Phe Ala Arg Thr Arg Trp Ala
145 150 155 160

aat gac cgc cag gtt ctg ttg tgc ccg aat cca ttc cag gct cga cag 528
Asn Asp Arg Gln Val Leu Leu Cys Pro Asn Pro Phe Gln Ala Arg Gln
165 170 175

gta gtc gaa cgt ggc tgc ggt gat cga cga tcc ggt gac gtc gaa ccc 576
Val Val Glu Arg Gly Cys Gly Asp Arg Arg Ser Gly Asp Val Glu Pro
180 185 190

gtc gag ggt ctt ggt gac cgg gaa ggc tgc ggc ctt gag acg gtt ggc 624
Val Glu Gly Leu Gly Asp Arg Glu Gly Cys Gly Leu Glu Thr Val Gly
195 200 205

ggt gtt gga ggc atc gcg ggc agc gat ctc ggc ctc aac caa cgt ccg 672
Gly Val Gly Gly Ile Ala Gly Ser Asp Leu Gly Leu Asn Gln Arg Pro
210 215 220

cag gat ctc ctc cgg tgt cca gcg ttg cgt ctt ggc gac ttg caa cac 720
Gln Asp Leu Leu Arg Cys Pro Ala Leu Arg Leu Gly Asp Leu Gln His
225 230 235 240

ctc ggc ggc gtt gcg gcg cac cgt ggc cag ctt caa ccg ccg cag cgc 768
Leu Gly Gly Val Ala Ala His Arg Gly Gln Leu Gln Pro Pro Gln Arg
245 250 255

cgc gtc aag gtc agc agc cag cgg tgc cgc cga gga cgg tgc cac cgg 816
Arg Val Lys Val Ser Ser Gln Arg Cys Arg Arg Gly Arg Cys His Arg
260 265 270

ctt ggc agc ggt ggt cat gag gcc gtc ccg tcg gtg gtg ttg atc ttg 864
Leu Gly Ser Gly Gly His Glu Ala Val Pro Ser Val Val Leu Ile Leu
275 280 285

tag 867

<210> 26
<211> 288
<212> PRT

<213> Mycobacterium

<400> 26

Val Ala Ser Arg Ser Pro His Ser Ala Ala Gly Gly Trp Leu Ile Leu
1 5 10 15

Gly Gly Ser Leu Leu Val Val Gly Val Ala His Pro Val Gly Leu Ala
20 25 30

Gly Gly Asp Asp Asp Ala Gly Val Val Gln Gln Pro Ile Glu Asp Ala
35 40 45

Gly Gly Gly Gly Val Leu Gly Gln Glu Ser Pro Pro Leu Phe Glu Gly
50 55 60

Pro Met Arg Gly Asp Gly Gln Gly Ala Ala Leu Val Ala Gly Ser His
65 70 75 80

Glu Pro Glu Gln Gln Leu Ser Pro Gly Val Val Glu Arg Gly Glu Ala
85 90 95

Asp Leu Val Gln Asp Asp Gln Ile Arg Ala Glu Gln Gly Val Asp Asp
100 105 110

Leu Ala Asp Gly Val Val Gly Gln Ala Ala Val Glu Asp Leu Asp Gln
115 120 125

Val Gly Gly Gly Glu Val Ala Asp Phe Glu Ser Gly Val Asp Gly Ser
130 135 140

Val Pro Ala Ala Asp Glu Gln Val Thr Phe Ala Arg Thr Arg Trp Ala
145 150 155 160

Asn Asp Arg Gln Val Leu Leu Cys Pro Asn Pro Phe Gln Ala Arg Gln
165 170 175

Val Val Glu Arg Gly Cys Gly Asp Arg Arg Ser Gly Asp Val Glu Pro
180 185 190

Val Glu Gly Leu Gly Asp Arg Glu Gly Cys Gly Leu Glu Thr Val Gly
195 200 205

Gly Val Gly Gly Ile Ala Gly Ser Asp Leu Gly Leu Asn Gln Arg Pro
210 215 220

Gln Asp Leu Leu Arg Cys Pro Ala Leu Arg Leu Gly Asp Leu Gln His
225 230 235 240

Leu Gly Gly Val Ala Ala His Arg Gly Gln Leu Gln Pro Pro Gln Arg
245 250 255

Arg Val Lys Val Ser Ser Gln Arg Cys Arg Arg Gly Arg Cys His Arg
260 265 270

Leu Gly Ser Gly Gly His Glu Ala Val Pro Ser Val Val Leu Ile Leu
275 280 285



<210> 27 

<211> 1739 

<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(945)

<400> 27 

atg ggc tgc ctc aaa ggt ggt gtc gtc gcc aat gtt gtt gtt cca aca 48
Met Gly Cys Leu Lys Gly Gly Val Val Ala Asn Val Val Val Pro Thr
1 5 10 15

ccg gat tat gtg cga ttc gcg tcc cac tat ggc ttc gtt ccg gac ttc 96
Pro Asp Tyr Val Arg Phe Ala Ser His Tyr Gly Phe Val Pro Asp Phe
20 25 30

tgc cac ggt gcg gat ccg caa tcg aag ggc atc gtg gag aac ctc tgt 144
Cys His Gly Ala Asp Pro Gln Ser Lys Gly Ile Val Glu Asn Leu Cys
35 40 45

ggc tac gct cag gac gac ctt gcg gtg ccg ctg ctg acc gaa gct gcg 192
Gly Tyr Ala Gln Asp Asp Leu Ala Val Pro Leu Leu Thr Glu Ala Ala
50 55 60

tta gcc ggt gag cag gtc gac cta cgt gcc ctc aac gcc cag gcg caa 240
Leu Ala Gly Glu Gln Val Asp Leu Arg Ala Leu Asn Ala Gln Ala Gln
65 70 75 80

cta tgg tgc gcc gag gtc aat gcc acg gtc cac tcg gag atc tgc gcc 288
Leu Trp Cys Ala Glu Val Asn Ala Thr Val His Ser Glu Ile Cys Ala
85 90 95

gtg ccc aac gat cgc ttg gtt gac gag cgc acc gtc ttg agg gag ctg 336
Val Pro Asn Asp Arg Leu Val Asp Glu Arg Thr Val Leu Arg Glu Leu
100 105 110

ccc tcg ctg cgg ccg acg atc ggc tcg ggg tcg gtg cgc cgt aag gtc 384
Pro Ser Leu Arg Pro Thr Ile Gly Ser Gly Ser Val Arg Arg Lys Val
115 120 125

gac ggc ctc tcg tgc atc cgt tac ggc tca gct cgt tac tcg gtg cct 432
Asp Gly Leu Ser Cys Ile Arg Tyr Gly Ser Ala Arg Tyr Ser Val Pro
130 135 140

cag cgg ctc gtc ggt gcc acc gtg gcg gtg gtg gtc gat cat ggc gcc 480
Gln Arg Leu Val Gly Ala Thr Val Ala Val Val Val Asp His Gly Ala
145 150 155 160

ctg atc ctg ttg gaa cct gcg acc ggt gtg atc gtg gcc gag cac gag 528
Leu Ile Leu Leu Glu Pro Ala Thr Gly Val Ile Val Ala Glu His Glu
165 170 175

ctc gtc agc cca ggt gag gtg tcc atc ctc gat gaa cac tac gac gga 576
Leu Val Ser Pro Gly Glu Val Ser Ile Leu Asp Glu His Tyr Asp Gly
180 185 190

ccc aga ccc gca ccc tcg cgt ggt cct cgc ccg aaa acc caa gca gag 624
Pro Arg Pro Ala Pro Ser Arg Gly Pro Arg Pro Lys Thr Gln Ala Glu
195 200 205

aaa cga ttc tgc gca ttg gga acc gaa gcg cag cag ttc ctc gtc ggt 672
Lys Arg Phe Cys Ala Leu Gly Thr Glu Ala Gln Gln Phe Leu Val Gly
210 215 220

gct gct gcg atc ggc aac acc cga ctg aaa tcc gaa ctc gac att ctg 720
Ala Ala Ala Ile Gly Asn Thr Arg Leu Lys Ser Glu Leu Asp Ile Leu
225 230 235 240

ctc ggc ctt ggc gcc gcc cac ggc gaa cag gct ttg att gac gcg ctg 768
Leu Gly Leu Gly Ala Ala His Gly Glu Gln Ala Leu Ile Asp Ala Leu
245 250 255

cgc cgg gcg gtt gcg ttt cgc cgg ttc cgc gct gcc gac gtg cgc tcg 816
Arg Arg Ala Val Ala Phe Arg Arg Phe Arg Ala Ala Asp Val Arg Ser
260 265 270

atc ctg gcc gcc ggc gcc ggc acc cca caa ccc cgc ccc gcc ggc gac 864
Ile Leu Ala Ala Gly Ala Gly Thr Pro Gln Pro Arg Pro Ala Gly Asp
275 280 285

gca ctc gtg ctc gat ctg ccc acc gtc gag acc cgc tcg ttg gag gcc 912
Ala Leu Val Leu Asp Leu Pro Thr Val Glu Thr Arg Ser Leu Glu Ala
290 295 300

tac aag atc aac acc acc gac ggg acg gcc tca tgaccaccgc tgccaagccg 965
Tyr Lys Ile Asn Thr Thr Asp Gly Thr Ala Ser
305 310 315



gtggcaccgt cctcggcggc accgctggct gctgaccttg acgcggcgct gcggcggttg 1025
aagctggcca cggtgcgccg caacgccgcc gaggtgttgc aagtcgccaa gacgcaacgc 1085
tggacaccgg aggagatcct gcggacgttg gttgaggccg agatcgctgc ccgcgatgcc 1145
tccaacaccg ccaaccgtct caaggccgca gccttcccgg tcaccaagac cctcgacggg 1205
ttcgacgtca ccggatcgtc gatcaccgca gccacgttcg actacctgtc gagcctggaa 1265
tggattcggg cacaacagaa cctggcggtc attggcccac ctggtacggg caaaagtcac 1325
ctgctcatcg gctgcgggca cgctgccgtc cacgccggat tcaaagtccg ctacttcacc 1385
gccgccgacc tgatcgaggt cctctaccgc ggcctggccg acaacaccgt cggcaagatc 1445
atcgacaccc tgctccgcgc ggatctggtc atcttggacg agatcggctt cgccccgctc 1505
gacgacaccg ggactcaact gttgttccgg ctcgtggctg ccggctacga gcgccgctcc 1565
ctggccatcg cctcgcattg gcccttcgaa caatgggggc gattcctgcc cgagcacacc 1625
accgccgcca gcatcctcga tcggctgctg caccacgcca gcatcgtcgt cacctccggc 1685
gagtcctacc ggatgcgcca cgccgaccac aagaagggag ccgccaagaa ttag 1739



<210> 28 
<211> 315 
<212> PRT 

<213> Mycobacterium 
<400> 28 

Met Gly Cys Leu Lys Gly Gly Val Val Ala Asn Val Val Val Pro Thr
1 5 10 15

Pro Asp Tyr Val Arg Phe Ala Ser His Tyr Gly Phe Val Pro Asp Phe
20 25 30

Cys His Gly Ala Asp Pro Gln Ser Lys Gly Ile Val Glu Asn Leu Cys
35 40 45

Gly Tyr Ala Gln Asp Asp Leu Ala Val Pro Leu Leu Thr Glu Ala Ala
50 55 60

Leu Ala Gly Glu Gln Val Asp Leu Arg Ala Leu Asn Ala Gln Ala Gln
65 70 75 80

Leu Trp Cys Ala Glu Val Asn Ala Thr Val His Ser Glu Ile Cys Ala
85 90 95

Val Pro Asn Asp Arg Leu Val Asp Glu Arg Thr Val Leu Arg Glu Leu
100 105 110

Pro Ser Leu Arg Pro Thr Ile Gly Ser Gly Ser Val Arg Arg Lys Val
115 120 125

Asp Gly Leu Ser Cys Ile Arg Tyr Gly Ser Ala Arg Tyr Ser Val Pro
130 135 140

Gln Arg Leu Val Gly Ala Thr Val Ala Val Val Val Asp His Gly Ala
145 150 155 160

Leu Ile Leu Leu Glu Pro Ala Thr Gly Val Ile Val Ala Glu His Glu
165 170 175

Leu Val Ser Pro Gly Glu Val Ser Ile Leu Asp Glu His Tyr Asp Gly
180 185 190

Pro Arg Pro Ala Pro Ser Arg Gly Pro Arg Pro Lys Thr Gln Ala Glu
195 200 205

Lys Arg Phe Cys Ala Leu Gly Thr Glu Ala Gln Gln Phe Leu Val Gly
210 215 220

Ala Ala Ala Ile Gly Asn Thr Arg Leu Lys Ser Glu Leu Asp Ile Leu
225 230 235 240

Leu Gly Leu Gly Ala Ala His Gly Glu Gln Ala Leu Ile Asp Ala Leu
245 250 255

Arg Arg Ala Val Ala Phe Arg Arg Phe Arg Ala Ala Asp Val Arg Ser
260 265 270

Ile Leu Ala Ala Gly Ala Gly Thr Pro Gln Pro Arg Pro Ala Gly Asp
275 280 285

Ala Leu Val Leu Asp Leu Pro Thr Val Glu Thr Arg Ser Leu Glu Ala
290 295 300

Tyr Lys Ile Asn Thr Thr Asp Gly Thr Ala Ser
305 310 315



<210> 29 
<211> 264 
<212> PRT 

<213> Mycobacterium 
<220> 

<221> DOMAIN 
<222> (1)..(264)

<223> amino acid sequence is encoded by nucleotides 
945-1736 of SEQ ID NO:27 

<400> 29 

Met Thr Thr Ala Ala Lys Pro Val Ala Pro Ser Ser Ala Ala Pro Leu
1 5 10 15

Ala Ala Asp Leu Asp Ala Ala Leu Arg Arg Leu Lys Leu Ala Thr Val
20 25 30

Arg Arg Asn Ala Ala Glu Val Leu Gln Val Ala Lys Thr Gln Arg Trp
35 40 45

Thr Pro Glu Glu Ile Leu Arg Thr Leu Val Glu Ala Glu Ile Ala Ala
50 55 60

Arg Asp Ala Ser Asn Thr Ala Asn Arg Leu Lys Ala Ala Ala Phe Pro
65 70 75 80

Val Thr Lys Thr Leu Asp Gly Phe Asp Val Thr Gly Ser Ser Ile Thr
85 90 95

Ala Ala Thr Phe Asp Tyr Leu Ser Ser Leu Glu Trp Ile Arg Ala Gln
100 105 110

Gln Asn Leu Ala Val Ile Gly Pro Pro Gly Thr Gly Lys Ser His Leu
115 120 125

Leu Ile Gly Cys Gly His Ala Ala Val His Ala Gly Phe Lys Val Arg
130 135 140

Tyr Phe Thr Ala Ala Asp Leu Ile Glu Val Leu Tyr Arg Gly Leu Ala
145 150 155 160

Asp Asn Thr Val Gly Lys Ile Ile Asp Thr Leu Leu Arg Ala Asp Leu
165 170 175

Val Ile Leu Asp Glu Ile Gly Phe Ala Pro Leu Asp Asp Thr Gly Thr
180 185 190

Gln Leu Leu Phe Arg Leu Val Ala Ala Gly Tyr Glu Arg Arg Ser Leu
195 200 205

Ala Ile Ala Ser His Trp Pro Phe Glu Gln Trp Gly Arg Phe Leu Pro
210 215 220

Glu His Thr Thr Ala Ala Ser Ile Leu Asp Arg Leu Leu His His Ala
225 230 235 240

Ser Ile Val Val Thr Ser Gly Glu Ser Tyr Arg Met Arg His Ala Asp
245 250 255

His Lys Lys Gly Ala Ala Lys Asn
260



<210> 30 
<211> 789 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(786)

<400> 30 

gtg acg tct get ccg acc gtc teg gtg ata acg ate teg ttc aac gac 48 

Met Thr Ser Ala Pro Thr Val Ser Val lie Thr He Ser Phe Asn Asp 
15 10 15 

etc gac ggg ttg cag cgc acg gtg aaa agt gtg egg gcg caa cgc tac 96 
Leu Asp Gly Leu Gin Arg Thr Val Lys Ser Val Arg Ala Gin Arg Tyr 
20 25 30 

egg gga cgc ate gag cac ate gta ate gac ggt ggc age ggc gac gac 144 
Arg Gly Arg He Glu His He Val He Asp Gly Gly Ser Gly Asp Asp 
35 40 45 

gtg gtg gca tac ctg tec ggg tgt gaa cca ggc ttc gcg tat tgg cag 192 
Val Val Ala Tyr Leu Ser Gly Cys Glu Pro Gly Phe Ala Tyr Trp Gin 
50 55 60 

tcc gag ccc gac ggc ggg cgg tac gac gcg atg aac cag ggc atc gcg 240
Ser Glu Pro Asp Gly Gly Arg Tyr Asp Ala Met Asn Gln Gly Ile Ala
65 70 75 80

cac gca teg ggt gat ctg ttg tgg ttc ttg cac tec gec gat cgt ttt 288 
His Ala Ser Gly Asp Leu Leu Trp Phe Leu His Ser Ala Asp Arg Phe 
85 90 95 

tcc ggg ccc gac gtg gta gcc cag gcc gtg gag gcg cta tcc ggc aag 336
Ser Gly Pro Asp Val Val Ala Gln Ala Val Glu Ala Leu Ser Gly Lys
100 105 110

gga ccg gtg tec gaa ttg tgg ggc ttc ggg atg gat cgt etc gtc ggg 384 
Gly Pro Val Ser Glu Leu Trp Gly Phe Gly Met Asp Arg Leu Val Gly 
115 120 125 

etc gat egg gtg cgc ggc ccg ata cct ttc age ctg cgc aaa ttc ctg 432 

Leu Asp Arg Val Arg Gly Pro lie Pro Phe Ser Leu Arg Lys Phe Leu 
130 135 140 

gec ggc aag cag gtt gtt ccg cat caa gca teg ttc ttc gga tea teg 480 

Ala Gly Lys Gin Val Val Pro His Gin Ala Ser Phe Phe Gly Ser Ser 

145 150 155 160 

ctg gtg gee aag ate ggt ggc tac gac ctt gat ttc ggg ate gee gee 528 

Leu Val Ala Lys He Gly Gly Tyr Asp Leu Asp Phe Gly He Ala Ala 
165 170 175 

gac cag gaa ttc ata ttg egg gee gcg ctg gta tgc gag ccg gtc acg 576 

Asp Gin Glu Phe He Leu Arg Ala Ala Leu Val Cys Glu Pro Val Thr 
180 185 190 

att egg tgt gtg ctg tgc gag ttc gac acc acg ggc gtc ggc teg cac 624 

He Arg Cys Val Leu Cys Glu Phe Asp Thr Thr Gly Val Gly Ser His 
195 200 205 

cgg gaa cca agc gcg gtc ttc ggt gat czq cgc cgc atg ggc gac ctt 672
Arg Glu Pro Ser Ala Val Phe Gly Asp Leu Arg Arg Met Gly Asp Leu
210 215 220

cat cgc cgc tac ccg ttc ggg gga agg cga ata tea cat gee tac eta 720 

His Arg Arg Tyr Pro Phe Gly Gly Arg Arg He Ser His Ala Tyr Leu 

225 230 235 240 

cgc ggc egg gag ttc tac gec tac aac agt cga ttc tgg gaa aac gtc 768 

Arg Gly Arg Glu Phe Tyr Ala Tyr Asn Ser Arg Phe Trp Glu Asn Val 
245 250 255 

ttc acg cga atg teg aaa tag 789 

Phe Thr Arg Met Ser Lys 
260 



<210> 31 
<211> 262 
<212> PRT 

<213> Mycobacterium 



<400> 31 

Met Thr Ser Ala Pro Thr Val Ser Val Ile Thr Ile Ser Phe Asn Asp
1 5 10 15

Leu Asp Gly Leu Gln Arg Thr Val Lys Ser Val Arg Ala Gln Arg Tyr
20 25 30

Arg Gly Arg Ile Glu His Ile Val Ile Asp Gly Gly Ser Gly Asp Asp
35 40 45

Val Val Ala Tyr Leu Ser Gly Cys Glu Pro Gly Phe Ala Tyr Trp Gln
50 55 60

Ser Glu Pro Asp Gly Gly Arg Tyr Asp Ala Met Asn Gln Gly Ile Ala
65 70 75 80

His Ala Ser Gly Asp Leu Leu Trp Phe Leu His Ser Ala Asp Arg Phe
85 90 95

Ser Gly Pro Asp Val Val Ala Gln Ala Val Glu Ala Leu Ser Gly Lys
100 105 110

Gly Pro Val Ser Glu Leu Trp Gly Phe Gly Met Asp Arg Leu Val Gly
115 120 125

Leu Asp Arg Val Arg Gly Pro Ile Pro Phe Ser Leu Arg Lys Phe Leu
130 135 140

Ala Gly Lys Gln Val Val Pro His Gln Ala Ser Phe Phe Gly Ser Ser
145 150 155 160

Leu Val Ala Lys Ile Gly Gly Tyr Asp Leu Asp Phe Gly Ile Ala Ala
165 170 175

Asp Gln Glu Phe Ile Leu Arg Ala Ala Leu Val Cys Glu Pro Val Thr
180 185 190

Ile Arg Cys Val Leu Cys Glu Phe Asp Thr Thr Gly Val Gly Ser His
195 200 205

Arg Glu Pro Ser Ala Val Phe Gly Asp Leu Arg Arg Met Gly Asp Leu
210 215 220

His Arg Arg Tyr Pro Phe Gly Gly Arg Arg Ile Ser His Ala Tyr Leu
225 230 235 240

Arg Gly Arg Glu Phe Tyr Ala Tyr Asn Ser Arg Phe Trp Glu Asn Val
245 250 255



Phe Thr Arg Met Ser Lys 
260 



<210> 32 
<211> 1023 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(1020)

<400> 32

gtg aag cga gcg etc ate acc gga ate ace ggc cag gac ggc teg tat 48 

Met Lys Arg Ala Leu lie Thr Gly lie Thr Gly Gin Asp Gly Ser Tyr 
15 10 15 

etc gec gaa ctg ctg ctg gec aag ggg tat gag gtt cac ggg etc ate 96 

Leu Ala Glu Leu Leu Leu Ala Lys Gly Tyr Glu Val His Gly Leu lie 
20 25 30 

egg cgc get teg acg ttc aac acc teg egg ate gat cac etc tac gtc 144 

Arg Arg Ala Ser Thr Phe Asn Thr Ser Arg lie Asp His Leu Tyr Val 
35 40 45 

gac ccg cac caa ccg ggc gcg egg ctg ttt ctg cac tat ggt gac ctg 192 

Asp Pro His Gin Pro Gly Ala Arg Leu Phe Leu His Tyr Gly Asp Leu 
50 55 60 

ate gac gga acc egg ttg gtg acc ctg ctg age acc ate gaa ccc gac 240 

lie Asp Gly Thr Arg Leu Val Thr Leu Leu Ser Thr lie Glu Pro Asp 
65 70 75 80 

gag gtg tac aac ctg gcg gcg cag tea cac gtg egg gtg age ttc gac 288 

Glu Val Tyr Asn Leu Ala Ala Gin Ser His Val Arg Val Ser Phe Asp 
85 90 95 

gaa ccc gtg cac acc ggt gac acc acc ggc atg gga tec atg cga ctg 336 

Glu Pro Val His Thr Gly Asp Thr Thr Gly Met Gly Ser Met Arg Leu 
100 105 110 

ctg gaa gec gtt egg etc tct egg gtg cac tgc cgc ttc tat cag gcg 384 

Leu Glu Ala Val Arg Leu Ser Arg Val His Cys Arg Phe Tyr Gin Ala 
115 120 125 

tec teg teg gag atg ttc ggc gec teg ccg cca ccg cag aac gag ctg 432 

Ser Ser Ser Glu Met Phe Gly Ala Ser Pro Pro Pro Gin Asn Glu Leu 

130 135 140 

acg ccg ttc tac ccg egg tea ccg tat ggc gec gec aag gtc tat teg 480 

Thr Pro Phe Tyr Pro Arg Ser Pro Tyr Gly Ala Ala Lys Val Tyr Ser 
145 150 155 160 

tac tgg gcg acc cgc aat tat cgc gaa gcg tac gga ttg ttc gec gtt 528 

Tyr Trp Ala Thr Arg Asn Tyr Arg Glu Ala Tyr Gly Leu Phe Ala Val 
165 170 175 

aac ggc ate ttg ttc aat cac gaa tea ccg egg cgc ggt gag acg ttc 576 

Asn Gly lie Leu Phe Asn His Glu Ser Pro Arg Arg Gly Glu Thr Phe 
180 185 190 

gtg acc cga aag ate acc agg gec gtg gca cgc ate aag gec ggt ate 624 

Val Thr Arg Lys lie Thr Arg Ala Val Ala Arg He Lys Ala Gly He 
195 200 205 

cag tec gag gtc tat atg ggc aat ctg gat gcg gtc cgc gac tgg ggg 672 

Gin Ser Glu Val Tyr Met Gly Asn Leu Asp Ala Val Arg Asp Trp Gly 

210 215 220 

tac gcg ccc gaa tac gtc gaa ggc atg tgg egg atg ctg cag acc gac 720 

Tyr Ala Pro Glu Tyr Val Glu Gly Met Trp Arg Met Leu Gin Thr Asp 
225 230 235 240 

gag ccc gac gac ttc gtt ttg gcg acc ggg cgc ggt ttc acc gtg cgt 768

Glu Pro Asp Asp Phe Val Leu Ala Thr Gly Arg Gly Phe Thr Val Arg 

245 250 255 

gag ttc gcg egg gec gcg ttc gag cat gec ggt ttg gac tgg cag cag 816 

Glu Phe Ala Arg Ala Ala Phe Glu His Ala Gly Leu Asp Trp Gin Gin 

260 265 270 

tac gtg aaa ttc gac caa cgc tat ctg cgg ccc acc gag gtg gat tcg 864

Tyr Val Lys Phe Asp Gin Arg Tyr Leu Arg Pro Thr Glu Val Asp Ser 

275 280 285 

ctg ate ggc gac gcg acc aag get gec gaa ttg ctg ggc tgg agg get 912 

Leu lie Gly Asp Ala Thr Lys Ala Ala Glu Leu Leu Gly Trp Arg Ala 

290 295 300 

teg gtg cac act gac gag ttg get egg ate atg gtc gac gcg gac atg 960 

Ser Val His Thr Asp Glu Leu Ala Arg lie Met Val Asp Ala Asp Met 

305 310 315 320 

gcg gcg ctg gag tgc gaa ggc aag ccg tgg ate gac aag ccg atg ate 1008 

Ala Ala Leu Glu Cys Glu Gly Lys Pro Trp lie Asp Lys Pro Met lie 

325 330 335 

gec ggc egg aca tga 1023 
Ala Gly Arg Thr 
340 



<210> 33 
<211> 340 
<212> PRT 

<213> Mycobacterium 
<400> 33 

Met Lys Arg Ala Leu Ile Thr Gly Ile Thr Gly Gln Asp Gly Ser Tyr
1 5 10 15

Leu Ala Glu Leu Leu Leu Ala Lys Gly Tyr Glu Val His Gly Leu Ile
20 25 30

Arg Arg Ala Ser Thr Phe Asn Thr Ser Arg Ile Asp His Leu Tyr Val
35 40 45

Asp Pro His Gln Pro Gly Ala Arg Leu Phe Leu His Tyr Gly Asp Leu
50 55 60

Ile Asp Gly Thr Arg Leu Val Thr Leu Leu Ser Thr Ile Glu Pro Asp
65 70 75 80

Glu Val Tyr Asn Leu Ala Ala Gln Ser His Val Arg Val Ser Phe Asp
85 90 95

Glu Pro Val His Thr Gly Asp Thr Thr Gly Met Gly Ser Met Arg Leu
100 105 110

Leu Glu Ala Val Arg Leu Ser Arg Val His Cys Arg Phe Tyr Gln Ala
115 120 125

Ser Ser Ser Glu Met Phe Gly Ala Ser Pro Pro Pro Gln Asn Glu Leu
130 135 140

Thr Pro Phe Tyr Pro Arg Ser Pro Tyr Gly Ala Ala Lys Val Tyr Ser
145 150 155 160

Tyr Trp Ala Thr Arg Asn Tyr Arg Glu Ala Tyr Gly Leu Phe Ala Val
165 170 175

Asn Gly Ile Leu Phe Asn His Glu Ser Pro Arg Arg Gly Glu Thr Phe
180 185 190

Val Thr Arg Lys Ile Thr Arg Ala Val Ala Arg Ile Lys Ala Gly Ile
195 200 205

Gln Ser Glu Val Tyr Met Gly Asn Leu Asp Ala Val Arg Asp Trp Gly
210 215 220

Tyr Ala Pro Glu Tyr Val Glu Gly Met Trp Arg Met Leu Gln Thr Asp
225 230 235 240

Glu Pro Asp Asp Phe Val Leu Ala Thr Gly Arg Gly Phe Thr Val Arg
245 250 255

Glu Phe Ala Arg Ala Ala Phe Glu His Ala Gly Leu Asp Trp Gln Gln
260 265 270

Tyr Val Lys Phe Asp Gln Arg Tyr Leu Arg Pro Thr Glu Val Asp Ser
275 280 285

Leu Ile Gly Asp Ala Thr Lys Ala Ala Glu Leu Leu Gly Trp Arg Ala
290 295 300

Ser Val His Thr Asp Glu Leu Ala Arg Ile Met Val Asp Ala Asp Met
305 310 315 320

Ala Ala Leu Glu Cys Glu Gly Lys Pro Trp Ile Asp Lys Pro Met Ile
325 330 335

Ala Gly Arg Thr 
340 



<210> 34 
<211> 732 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(729)

<400> 34 

atg agg ctg gcc cgt cgc gct cgg aac atc ttg cgt cgc aac ggc atc 48
Met Arg Leu Ala Arg Arg Ala Arg Asn Ile Leu Arg Arg Asn Gly Ile
1 5 10 15

gag gtg tcg cgc tac ttt gcc gaa ctg gac tgg gaa cgc aat ttc ttg 96
Glu Val Ser Arg Tyr Phe Ala Glu Leu Asp Trp Glu Arg Asn Phe Leu
20 25 30

cgc caa ctg caa tcg cat cgg gtc agt gcc gtg ctc gat gtc ggg gcc 144
Arg Gln Leu Gln Ser His Arg Val Ser Ala Val Leu Asp Val Gly Ala
35 40 45

aat teg ggg cag tac gee agg ggt ctg cgc ggc gcg ggc ttc gcg ggc 192 
Asn Ser Gly Gin Tyr Ala Arg Gly Leu Arg Gly Ala Gly Phe Ala Gly 
50 55 60 

cgc ate gtc teg ttc gag ccg ctg ccc ggg ccc ttt gec gtc ttg cag 240 
Arg He Val Ser Phe Glu Pro Leu Pro Gly Pro Phe Ala Val Leu Gin 
65 70 75 80 

cgc age gec tec acg gac ccg ttg tgg gaa tgc egg cgc tgt gcg ctg 288 
Arg Ser Ala Ser Thr Asp Pro Leu Trp Glu Cys Arg Arg Cys Ala Leu 
85 90 95 

ggc gat gtc gat gga ace ate teg ate aac gtc gee ggc aac gag ggc 336 
Gly Asp Val Asp Gly Thr He Ser He Asn Val Ala Gly Asn Glu Gly 
100 105 110 

gec age agt tec gtc ttg ccg atg ttg aaa cga cat cag gac gee ttt 384 
Ala Ser Ser Ser Val Leu Pro Met Leu Lys Arg His Gin Asp Ala Phe 
115 120 125 

cca cca gec aac tac gtg ggc gec caa egg gtg ccg ata car cga etc 432 
Pro Pro Ala Asn Tyr Val Gly Ala Gin Arg Val Pro He His Arg Leu 
130 135 140 

gat tec gtg get gca gac gtt ctg egg ccc aac gat att gcg ttc ttg 480 
Asp Ser Val Ala Ala Asp Val Leu Arg Pro Asn Asp He Ala Phe Leu 
145 150 155 160 

aag ate gac gtt caa gga ttc gag aag cag gtg ate gcg ggt ggc gat 528 
Lys lie Asp Val Gin Gly Phe Glu Lys Gin Val lie Ala Gly Gly Asp 
165 170 175 

tea acg gtg cac gac cga tgc gtc ggc atg cag etc gag ctg tct ttc 576 
Ser Thr Val His Asp Arg Cys Val Gly Met Gin Leu Glu Leu Ser Phe 
180 185 190 

cag ccg ttg tac gag ggt ggc atg etc ate cgc gag gcg etc gat etc 624 
Gin Pro Leu Tyr Glu Gly Gly Met Leu lie Arg Glu Ala Leu Asp Leu 
195 200 205 

gtg gat teg ttg ggc ttt acg etc teg gga ttg caa ccc ggt ttc ace 672 
Val Asp Ser Leu Gly Phe Thr Leu Ser Gly Leu Gin Pro Gly Phe Thr 
210 215 220 

gac ccc cgc aac ggt cga atg ctg cag gec gat ggc ate ttc ttc egg 720 
Asp Pro Arg Asn Gly Arg Met Leu Gin Ala Asp Gly lie Phe Phe Arg 
225 230 235 240 

ggc age gat tga 732 
Gly Ser Asp 



<210> 35 
<211> 243 
<212> PRT 

<213> Mycobacterium 
<400> 35

Met Arg Leu Ala Arg Arg Ala Arg Asn Ile Leu Arg Arg Asn Gly Ile
1 5 10 15

Glu Val Ser Arg Tyr Phe Ala Glu Leu Asp Trp Glu Arg Asn Phe Leu
20 25 30

Arg Gln Leu Gln Ser His Arg Val Ser Ala Val Leu Asp Val Gly Ala
35 40 45

Asn Ser Gly Gln Tyr Ala Arg Gly Leu Arg Gly Ala Gly Phe Ala Gly
50 55 60

Arg Ile Val Ser Phe Glu Pro Leu Pro Gly Pro Phe Ala Val Leu Gln
65 70 75 80

Arg Ser Ala Ser Thr Asp Pro Leu Trp Glu Cys Arg Arg Cys Ala Leu
85 90 95

Gly Asp Val Asp Gly Thr Ile Ser Ile Asn Val Ala Gly Asn Glu Gly
100 105 110

Ala Ser Ser Ser Val Leu Pro Met Leu Lys Arg His Gln Asp Ala Phe
115 120 125

Pro Pro Ala Asn Tyr Val Gly Ala Gln Arg Val Pro Ile His Arg Leu
130 135 140

Asp Ser Val Ala Ala Asp Val Leu Arg Pro Asn Asp Ile Ala Phe Leu
145 150 155 160

Lys Ile Asp Val Gln Gly Phe Glu Lys Gln Val Ile Ala Gly Gly Asp
165 170 175

Ser Thr Val His Asp Arg Cys Val Gly Met Gln Leu Glu Leu Ser Phe
180 185 190

Gln Pro Leu Tyr Glu Gly Gly Met Leu Ile Arg Glu Ala Leu Asp Leu
195 200 205

Val Asp Ser Leu Gly Phe Thr Leu Ser Gly Leu Gln Pro Gly Phe Thr
210 215 220

Asp Pro Arg Asn Gly Arg Met Leu Gln Ala Asp Gly Ile Phe Phe Arg
225 230 235 240

Gly Ser Asp



<210> 36 
<211> 732 
<212> DNA 

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(729)

<400> 36

gtg aaa teg ttg aaa etc get cgt ttc ate gcg cgt age gee gee ttc 48 

Met Lys Ser Leu Lys Leu Ala Arg Phe lie Ala Arg Ser Ala Ala Phe 
15 10 15 

gag gtt teg cgc cgc tat tct gag cga gac ctg aag cac cag ttt gtg 96 

Glu Val Ser Arg Arg Tyr Ser Glu Arg Asp Leu Lys His Gin Phe Val 
20 25 30 

aag caa etc aaa teg cgt egg gta gat gtc gtt ttc gat gtc ggc gee 144 

Lys Gin Leu Lys Ser Arg Arg Val Asp Val Val Phe Asp Val Gly Ala 
35 40 45

aac tea gga caa tac gec gec ggc etc cgc cga gca gca tat aag ggc 192 

Asn Ser Gly Gin Tyr Ala Ala Gly Leu Arg Arg Ala Ala Tyr Lys Gly 
50 55 60 

cgc att gtc teg ttc gaa ccg eta tec gga ccg ttt acg ate ttg gaa 240 

Arg lie Val Ser Phe Glu Pro Leu Ser Gly Pro Phe Thr lie Leu Glu 
65 70 75 80 

age aaa gcg tea acg gat cca ctt tgg gat tgc egg cag cat gcg ttg 288 

Ser Lys Ala Ser Thr Asp Pro Leu Trp Asp Cys Arg Gin His Ala Leu 
85 90 95 

ggc gat tct gat gga acg gtt acg ate aat ate gca gga aac gec ggt 336 

Gly Asp Ser Asp Gly Thr Val Thr He Asn He Ala Gly Asn Ala Gly 
100 105 110 

cag age agt tec gtc ttg ccc atg ctg aaa agt cat cag aac get ttt 384 

Gin Ser Ser Ser Val Leu Pro Met Leu Lys Ser His Gin Asn Ala Phe 
115 120 125 

ccc ccg gca aac tat gtc ggt acc caa gag gcg tec ata cat cga ctt 432 

Pro Pro Ala Asn Tyr Val Gly Thr Gin Glu Ala Ser He His Arg Leu 
130 135 140 

gat tec gtg gcg cca gaa ttt eta ggc arg aac ggt gtc get ttt etc 480 

Asp Ser Val Ala Pro Glu Phe Leu Gly Met Asn Gly Val Ala Phe Leu 

145 150 155 160 

aag gtc gac gtt caa ggc ttt gaa aag cag gtg etc gee ggg ggc aaa 528 

Lys Val Asp Val Gin Gly Phe Glu Lys Gin Val Leu Ala Gly Gly Lys 
165 170 175 

tea acc ata gat gac cat tgc gtc ggc atg caa etc gaa ctg tec ttc 576 

Ser Thr He Asp Asp His Cys Val Gly Met Gin Leu Glu Leu Ser Phe 
180 185 190 

ctg ccg ttg tac gaa ggt ggc atg etc att cct gaa gec etc gat etc 624 

Leu Pro Leu Tyr Glu Gly Gly Met Leu He Pro Glu Ala Leu Asp Leu 
195 200 205 

gtg tat tec ttg ggc ttc acg ttg acg gga ttg ctg cct tgt ttc att 672 

Val Tyr Ser Leu Gly Phe Thr Leu Thr Gly Leu Leu Pro Cys Phe He 
210 215 220 

gat gca aat aat ggt cga atg ttg cag gee gac ggc ate ttt ttc cgc 720 

Asp Ala Asn Asn Gly Arg Met Leu Gin Ala Asp Gly He Phe Phe Arg 

225 230 235 240 

gag gac gat tga 732
Glu Asp Asp



<210> 37 
<211> 243 
<212> PRT 

<213> Mycobacterium 
<400> 37 

Met Lys Ser Leu Lys Leu Ala Arg Phe Ile Ala Arg Ser Ala Ala Phe
1 5 10 15

Glu Val Ser Arg Arg Tyr Ser Glu Arg Asp Leu Lys His Gln Phe Val
20 25 30

Lys Gln Leu Lys Ser Arg Arg Val Asp Val Val Phe Asp Val Gly Ala
35 40 45

Asn Ser Gly Gln Tyr Ala Ala Gly Leu Arg Arg Ala Ala Tyr Lys Gly
50 55 60

Arg Ile Val Ser Phe Glu Pro Leu Ser Gly Pro Phe Thr Ile Leu Glu
65 70 75 80

Ser Lys Ala Ser Thr Asp Pro Leu Trp Asp Cys Arg Gln His Ala Leu
85 90 95

Gly Asp Ser Asp Gly Thr Val Thr Ile Asn Ile Ala Gly Asn Ala Gly
100 105 110

Gln Ser Ser Ser Val Leu Pro Met Leu Lys Ser His Gln Asn Ala Phe
115 120 125

Pro Pro Ala Asn Tyr Val Gly Thr Gln Glu Ala Ser Ile His Arg Leu
130 135 140

Asp Ser Val Ala Pro Glu Phe Leu Gly Met Asn Gly Val Ala Phe Leu
145 150 155 160

Lys Val Asp Val Gln Gly Phe Glu Lys Gln Val Leu Ala Gly Gly Lys
165 170 175

Ser Thr Ile Asp Asp His Cys Val Gly Met Gln Leu Glu Leu Ser Phe
180 185 190

Leu Pro Leu Tyr Glu Gly Gly Met Leu Ile Pro Glu Ala Leu Asp Leu
195 200 205

Val Tyr Ser Leu Gly Phe Thr Leu Thr Gly Leu Leu Pro Cys Phe Ile
210 215 220

Asp Ala Asn Asn Gly Arg Met Leu Gln Ala Asp Gly Ile Phe Phe Arg
225 230 235 240

Glu Asp Asp 



<210> 38 
<211> 828 
<212> DNA

<213> Mycobacterium 

<220> 

<221> CDS 

<222> (1)..(825)



<400> 38 

atg gtg cag acg aaa cga tac gcc ggc ttg acc gca gct aac aca aag 48
Met Val Gln Thr Lys Arg Tyr Ala Gly Leu Thr Ala Ala Asn Thr Lys
1 5 10 15

aaa gtc gcc atg gcc gca cca atg ttt tcg atc atc atc ccc acc ttg 96
Lys Val Ala Met Ala Ala Pro Met Phe Ser Ile Ile Ile Pro Thr Leu
20 25 30

aac gtg gct gcg gta ttg cct gcc tgc ctc gac agc atc gcc cgt cag 144
Asn Val Ala Ala Val Leu Pro Ala Cys Leu Asp Ser Ile Ala Arg Gln
35 40 45

acc tgc ggt gac ttc gag ctg gta ctg gtc gac ggc ggc tcg acg gac 192
Thr Cys Gly Asp Phe Glu Leu Val Leu Val Asp Gly Gly Ser Thr Asp
50 55 60

gaa acc ctc gac atc gcc aac att ttc gcc ccc aac ctc ggc gag cgg 240
Glu Thr Leu Asp Ile Ala Asn Ile Phe Ala Pro Asn Leu Gly Glu Arg
65 70 75 80

ttg atc att cat cgc gac acc gac cag ggc gtc tac gac gcc atg aac 288
Leu Ile Ile His Arg Asp Thr Asp Gln Gly Val Tyr Asp Ala Met Asn
85 90 95

cgc ggc gtg gac ctg gcc acc gga acg tgg ttg ctc ttt ctg ggc gcg 336
Arg Gly Val Asp Leu Ala Thr Gly Thr Trp Leu Leu Phe Leu Gly Ala
100 105 110

gac gac agc ctg tac gag gct gac acc ctg gcg cgg gtg gcc gcc ttc 384
Asp Asp Ser Leu Tyr Glu Ala Asp Thr Leu Ala Arg Val Ala Ala Phe
115 120 125

att ggc gaa cac gag ccc agc gat ctg gta tat ggc gac gtg atc atg 432
Ile Gly Glu His Glu Pro Ser Asp Leu Val Tyr Gly Asp Val Ile Met
130 135 140

cgc tca acc aat ttc cgc tgg ggr ggc gcc ttc gac ctc gac cgt ctg 480
Arg Ser Thr Asn Phe Arg Trp Gly Gly Ala Phe Asp Leu Asp Arg Leu
145 150 155 160

ttg ttc aag cgc aac atc tgc cat cag gcg atc ttc tac cgc cgc gga 528
Leu Phe Lys Arg Asn Ile Cys His Gln Ala Ile Phe Tyr Arg Arg Gly
165 170 175

ctc ttc ggc acc atc ggt ccc tac aac ctc cgc tac cgg gtc ctg gcc 576
Leu Phe Gly Thr Ile Gly Pro Tyr Asn Leu Arg Tyr Arg Val Leu Ala
180 185 190

gac tgg gac ttc aat att cgc tgc ttt tcc aac cca gcg ctc gtc acc 624
Asp Trp Asp Phe Asn Ile Arg Cys Phe Ser Asn Pro Ala Leu Val Thr
195 200 205

cgc tac atg cac gtg gtc gtt gca agc tac aac gaa ttc ggc ggg ctc 672
Arg Tyr Met His Val Val Val Ala Ser Tyr Asn Glu Phe Gly Gly Leu
210 215 220

agc aat acg atc gtc gac aag gag ttt ttg aag cgg ctg ccg atg tcc 720
Ser Asn Thr Ile Val Asp Lys Glu Phe Leu Lys Arg Leu Pro Met Ser
225 230 235 240

acg aga ctc ggc ata agg ctg gtc ata gtt ctg gtg cgc agg tgg cca 768
Thr Arg Leu Gly Ile Arg Leu Val Ile Val Leu Val Arg Arg Trp Pro
245 250 255

aag gtg atc agc agg gcc atg gta atg cgc acc gtc att tct tgg cgg 816
Lys Val Ile Ser Arg Ala Met Val Met Arg Thr Val Ile Ser Trp Arg
260 265 270

cgc cga cgt tag 828
Arg Arg Arg
275



<210> 39 
<211> 275 
<212> PRT 

<213> Mycobacterium 
<400> 39 

Met Val Gln Thr Lys Arg Tyr Ala Gly Leu Thr Ala Ala Asn Thr Lys
1 5 10 15

Lys Val Ala Met Ala Ala Pro Met Phe Ser Ile Ile Ile Pro Thr Leu
20 25 30

Asn Val Ala Ala Val Leu Pro Ala Cys Leu Asp Ser Ile Ala Arg Gln
35 40 45

Thr Cys Gly Asp Phe Glu Leu Val Leu Val Asp Gly Gly Ser Thr Asp
50 55 60

Glu Thr Leu Asp Ile Ala Asn Ile Phe Ala Pro Asn Leu Gly Glu Arg
65 70 75 80

Leu Ile Ile His Arg Asp Thr Asp Gln Gly Val Tyr Asp Ala Met Asn
85 90 95

Arg Gly Val Asp Leu Ala Thr Gly Thr Trp Leu Leu Phe Leu Gly Ala
100 105 110

Asp Asp Ser Leu Tyr Glu Ala Asp Thr Leu Ala Arg Val Ala Ala Phe
115 120 125

Ile Gly Glu His Glu Pro Ser Asp Leu Val Tyr Gly Asp Val Ile Met
130 135 140

Arg Ser Thr Asn Phe Arg Trp Gly Gly Ala Phe Asp Leu Asp Arg Leu
145 150 155 160

Leu Phe Lys Arg Asn Ile Cys His Gln Ala Ile Phe Tyr Arg Arg Gly
165 170 175

Leu Phe Gly Thr Ile Gly Pro Tyr Asn Leu Arg Tyr Arg Val Leu Ala
180 185 190

Asp Trp Asp Phe Asn Ile Arg Cys Phe Ser Asn Pro Ala Leu Val Thr
195 200 205

Arg Tyr Met His Val Val Val Ala Ser Tyr Asn Glu Phe Gly Gly Leu
210 215 220

Ser Asn Thr Ile Val Asp Lys Glu Phe Leu Lys Arg Leu Pro Met Ser
225 230 235 240

Thr Arg Leu Gly Ile Arg Leu Val Ile Val Leu Val Arg Arg Trp Pro
245 250 255

Lys Val Ile Ser Arg Ala Met Val Met Arg Thr Val Ile Ser Trp Arg
260 265 270

Arg Arg Arg
275



<210> 40 
<211> 24 
<212> DNA 

<213> Mycobacterium 
<400> 40 

gatgccgtga ggaggtaaag ctgc 



<210> 41 
<211> 24 
<212> DNA 

<213> Mycobacterium 



<400> 41 

gatacggctc ttgaatcctg cacg 